HBase
HBASE-8806

Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.94.5
    • Fix Version/s: 0.94.10
    • Component/s: regionserver
    • Labels: None

      Description

      If we already hold the row lock in doMiniBatchMutation, we don't need to re-acquire it. The fix is to keep a set of the row keys already locked during the miniBatchMutation and skip acquiring the lock again for any row that is already in that set.

      We have tested this fix in our production environment and it has improved replication performance considerably: replication batches with duplicate row keys went from 3+ minutes to less than 10 seconds.
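      For illustration, here is a minimal sketch of that set-based deduplication. It is not the actual HRegion.doMiniBatchMutation code: the RowLock interface, acquireRowLock, applyMiniBatch, and the class name are hypothetical stand-ins, and Bytes.BYTES_COMPARATOR is the only real HBase API used.

      import java.util.ArrayList;
      import java.util.List;
      import java.util.Set;
      import java.util.TreeSet;

      import org.apache.hadoop.hbase.util.Bytes;

      public class MiniBatchLockDedupSketch {

        // Hypothetical stand-in for the region's row-lock handle.
        interface RowLock {
          void release();
        }

        // Hypothetical lock acquisition; in HRegion this role is played by getLock().
        RowLock acquireRowLock(byte[] row) {
          return new RowLock() {
            public void release() {
              // no-op in this sketch
            }
          };
        }

        void applyMiniBatch(List<byte[]> rowsInBatch) {
          // Row keys already locked by THIS mini-batch. A TreeSet backed by
          // Bytes.BYTES_COMPARATOR compares byte[] by content, which a HashSet would not.
          Set<byte[]> lockedRows = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
          List<RowLock> acquiredLocks = new ArrayList<RowLock>();
          try {
            for (byte[] row : rowsInBatch) {
              // Acquire the lock only the first time this row key is seen in the batch.
              if (lockedRows.add(row)) {
                acquiredLocks.add(acquireRowLock(row));
              }
              // ... apply the mutation for this row ...
            }
          } finally {
            // Release each distinct lock exactly once.
            for (RowLock lock : acquiredLocks) {
              lock.release();
            }
          }
        }
      }

      A plain HashSet<byte[]> would not work here because byte arrays use identity-based equals/hashCode; the comparator-backed set is what makes duplicate row keys actually collapse.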

      // Not final: the counter is incremented by the getLock() override below.
      static int ACQUIRE_LOCK_COUNT = 0;

        @Test
        public void testRedundantRowKeys() throws Exception {

          final int batchSize = 100000;

          String tableName = getClass().getSimpleName();
          Configuration conf = HBaseConfiguration.create();
          // Register MockHRegion so the test can intercept row-lock acquisition.
          conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class);
          MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes("a"));

          // Build a batch of 100k Puts that alternate between just two row keys.
          List<Pair<Mutation, Integer>> someBatch = Lists.newArrayList();
          for (int i = 0; i < batchSize; i++) {
            if (i % 2 == 0) {
              someBatch.add(new Pair<Mutation, Integer>(new Put(Bytes.toBytes(0)), null));
            } else {
              someBatch.add(new Pair<Mutation, Integer>(new Put(Bytes.toBytes(1)), null));
            }
          }
          long startTime = System.currentTimeMillis();
          region.batchMutate(someBatch.toArray(new Pair[0]));
          long endTime = System.currentTimeMillis();
          long duration = endTime - startTime;
          System.out.println("duration: " + duration + " ms");
          // Only two distinct row keys, so the lock should only be acquired twice.
          assertEquals(2, ACQUIRE_LOCK_COUNT);
        }

        // Override in MockHRegion: counts every row-lock acquisition.
        @Override
        public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException {
          ACQUIRE_LOCK_COUNT++;
          return super.getLock(lockid, row, waitForLock);
        }
      
      
      Attachments

      1. row_lock_perf_results.txt
        6 kB
        churro morales
      2. HBASE-8806-threadBasedRowLocks-v2.patch
        43 kB
        Dave Latham
      3. HBASE-8806-threadBasedRowLocks.patch
        44 kB
        Dave Latham
      4. hbase-8806-0.94-microbenchmarks-dupe-rows.txt
        6 kB
        Dave Latham
      5. hbase-8806-0.94-microbenchmark-no-dupe-rows.txt
        5 kB
        Dave Latham
      6. HBASE-8806-0.94.10-v3.patch
        14 kB
        churro morales
      7. HBASE-8806-0.94.10-v2.patch
        8 kB
        churro morales
      8. HBASE-8806-0.94.10.patch
        8 kB
        churro morales
      9. HBASE-8806.patch
        3 kB
        Anoop Sam John
      10. 8806-0.94-v7.txt
        11 kB
        Lars Hofhansl
      11. 8806-0.94-v6.txt
        5 kB
        Lars Hofhansl
      12. 8806-0.94-v5.txt
        5 kB
        Lars Hofhansl
      13. 8806-0.94-v4.txt
        5 kB
        Lars Hofhansl

        Issue Links

        This issue is related to HBASE-8877

          Activity

          churro morales created issue -
          churro morales made changes -
          Description edited
          churro morales made changes -
          Description edited
          churro morales made changes -
          Description edited
          Dave Latham made changes -
          Fix Version/s 0.95.2 [ 12320040 ]
          Fix Version/s 0.94.10 [ 12324627 ]
          Fix Version/s 0.94.0 [ 12316419 ]
          Fix Version/s 0.98.0 [ 12323143 ]
          Fix Version/s 0.95.0 [ 12324094 ]
          churro morales made changes -
          Attachment HBASE-8806-0.94.10.patch [ 12589685 ]
          churro morales made changes -
          Attachment HBASE-8806-0.94.10-v2.patch [ 12589686 ]
          Lars Hofhansl made changes -
          Priority Major [ 3 ] Critical [ 2 ]
          Anoop Sam John made changes -
          Attachment HBASE-8806.patch [ 12590608 ]
          Anoop Sam John made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Dave Latham made changes -
          Attachment HBASE-8806-threadBasedRowLocks.patch [ 12590639 ]
          churro morales made changes -
          Attachment HBASE-8806-0.94.10-v3.patch [ 12590763 ]
          Lars Hofhansl made changes -
          Attachment 8806-0.94-v4.txt [ 12590791 ]
          Lars Hofhansl made changes -
          Attachment 8806-0.94-v4.txt [ 12590791 ]
          Lars Hofhansl made changes -
          Attachment 8806-0.94-v4.txt [ 12590792 ]
          Lars Hofhansl made changes -
          Attachment 8806-0.94-v5.txt [ 12590795 ]
          Dave Latham made changes -
          Attachment HBASE-8806-threadBasedRowLocks-v2.patch [ 12590906 ]
          Lars Hofhansl made changes -
          Attachment 8806-0.94-v6.txt [ 12590912 ]
          Dave Latham made changes -
          Link This issue is related to HBASE-8877 [ HBASE-8877 ]
          Lars Hofhansl made changes -
          Fix Version/s 0.98.0 [ 12323143 ]
          churro morales made changes -
          Attachment row_lock_perf_results.txt [ 12591712 ]
          Dave Latham made changes -
          Attachment hbase-8806-0.94-v6-microbenchmark-no-dupe-rows [ 12592339 ]
          Attachment hbase-8806-0.94-v6-microbenchmark [ 12592340 ]
          Dave Latham made changes -
          Attachment hbase-8806-0.94-v6-microbenchmark [ 12592340 ]
          Dave Latham made changes -
          Attachment hbase-8806-0.94-v6-microbenchmark-no-dupe-rows [ 12592339 ]
          Dave Latham made changes -
          Attachment hbase-8806-0.94-v6-microbenchmark-no-dupe-rows.txt [ 12592341 ]
          Attachment hbase-8806-0.94-v6-microbenchmark.txt [ 12592342 ]
          Dave Latham made changes -
          Attachment hbase-8806-0.94-v6-microbenchmark.txt [ 12592342 ]
          Dave Latham made changes -
          Attachment hbase-8806-0.94-v6-microbenchmark-no-dupe-rows.txt [ 12592341 ]
          Lars Hofhansl made changes -
          Fix Version/s 0.95.2 [ 12320040 ]
          Fix Version/s 0.98.0 [ 12323143 ]
          Lars Hofhansl made changes -
          Assignee Lars Hofhansl [ lhofhansl ]
          Lars Hofhansl made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Lars Hofhansl made changes -
          Attachment 8806-0.94-v7.txt [ 12592634 ]
          Lars Hofhansl made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Lars Hofhansl made changes -
          Assignee Lars Hofhansl [ lhofhansl ] rahul gidwani [ churromorales ]
          Lars Hofhansl made changes -
          Issue Type Bug [ 1 ] Improvement [ 4 ]
          Lars Hofhansl made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee: churro morales
            • Reporter: churro morales
            • Votes: 0
            • Watchers: 13

              Dates

              • Created:
                Updated:
                Resolved:
