[HBASE-15031] Fix merge of MVCC and SequenceID performance regression in branch-1.0 for Increments - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.0.3
Fix Version/s: 1.0.3, 1.1.3
Component/s: Performance
Labels: None
Hadoop Flags: Reviewed
Release Note:

Since HBase 1.0.0 (HBASE-8763), Increments can be 10x slower (or more) under high concurrency.

This 'fix' adds back a fast increment but speed is achieved by relaxing row-level consistency for Increments (only). The default remains the old, slow, consistent Increment behavior.

To enable 'fast' increments, set "hbase.increment.fast.but.narrow.consistency" to true in hbase-site.xml and then do a rolling restart of your cluster. The setting is read on the server side.
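
As a sketch, the property would sit inside the `<configuration>` element of hbase-site.xml on the servers. The property name and value are the ones given above; the layout is the standard Hadoop-style property format:

```xml
<property>
  <!-- Opt in to the fast Increment path; the default keeps the slow, consistent behavior -->
  <name>hbase.increment.fast.but.narrow.consistency</name>
  <value>true</value>
</property>
```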

Intermixing fast Increments with other Mutations gives indeterminate results; e.g. a Put and an Increment against the same Cell will not always give you the result you expect. Fast Increments are consistent unto themselves. To read the latest increment value, do a Get with IsolationLevel#READ_UNCOMMITTED or an Increment of amount zero (beware doing a Get on a cell that has not been incremented yet: it returns no results).

The difference between fastAndNarrowConsistencyIncrement and slowButConsistentIncrement is that the former holds the row lock until the WAL sync completes; this allows us to reason that there are no other writers afoot when we read the current increment value. In this case we do not need to wait on mvcc reads to catch up to writes before reading the current Increment value; that wait is the root of the slowdown seen in HBASE-14460. The fast path also does not wait on mvcc to complete before returning to the client (but the write has been synced and put into memstore before we return).
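
The row-lock reasoning above can be illustrated with a toy model. This is not HBase code: the class, the in-memory map standing in for the memstore, and the no-op `syncToLog` standing in for the WAL sync are all hypothetical, and mvcc is not modeled at all. It only shows why holding a per-row lock across the read, the "sync" and the store update lets each Increment read the current value without any further waiting.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only; every name here is hypothetical and none of it is HBase API.
final class RowLockIncrementSketch {
    private final ConcurrentHashMap<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Long> store = new ConcurrentHashMap<>();

    long increment(String row, long amount) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        try {
            // Safe to read directly: no other writer can touch this row while we hold its lock.
            long current = store.getOrDefault(row, 0L);
            long updated = current + amount;
            syncToLog(row, updated);   // stand-in for the WAL append and sync
            store.put(row, updated);   // stand-in for the memstore update
            return updated;
        } finally {
            lock.unlock();             // released only after the "sync" and the store update
        }
    }

    private void syncToLog(String row, long value) {
        // Intentionally empty: a real log write would go here.
    }

    // Runs `threads` writers against one row and returns the final value,
    // read with an Increment of amount zero.
    static long runConcurrently(int threads, int incrementsPerThread) {
        RowLockIncrementSketch region = new RowLockIncrementSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    region.increment("row1", 1L);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return region.increment("row1", 0L);
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(8, 1000));
    }
}
```

With 8 writers doing 1000 increments of 1 each, the zero-amount Increment at the end returns 8000: no update is lost, because each read of the current value happens under the row lock.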

Also adds a simple performance test tool that runs against an existing cluster. It expects the table to already exist (by default, a table named 'tableName' with a column family 'columnFamilyName'):

{code}
$ ./bin/hbase org.apache.hadoop.hbase.IncrementPerformanceTest
{code}

Configure it by passing -D options. The log line below shows the available settings:

2015-12-23 19:33:36,941 INFO [main] hbase.IncrementPerformanceTest: Running test with hbase.zookeeper.quorum=localhost, tableName=tableName, columnFamilyName=columnFamilyName, threadCount=80, incrementCount=10000

So, to set the table name, pass -DtableName=SOME_TABLENAME.

Here is an example use of the test tool:

{code}
$ time ./bin/hbase --config ~/conf_hbase org.apache.hadoop.hbase.IncrementPerformanceTest -DincrementCount=50000
{code}

Comparing before and after, without the patch I get:

2015-12-28 09:58:05,087 INFO [main] hbase.IncrementPerformanceTest: 75th=25.99884175, 95th=38.259990499999994, 99th=46.06327961000003

... and then with the patch:

2015-12-28 10:07:34,192 INFO [main] hbase.IncrementPerformanceTest: 75th=5.8296235, 95th=9.775977899999997, 99th=17.758502090000025
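
To put the two log lines side by side, here is a small sketch that divides each percentile without the patch by the same percentile with it. The class and method names are hypothetical; the numbers are copied from the log lines above, whose units are not stated.

```java
import java.util.Locale;

// Hypothetical helper: compares the percentiles reported in the two log lines above.
final class IncrementLatencyComparison {
    static final String[] LABELS = {"75th", "95th", "99th"};
    static final double[] WITHOUT_PATCH = {25.99884175, 38.259990499999994, 46.06327961000003};
    static final double[] WITH_PATCH = {5.8296235, 9.775977899999997, 17.758502090000025};

    // How many times smaller the patched value is than the unpatched one.
    static double ratio(double withoutPatch, double withPatch) {
        return withoutPatch / withPatch;
    }

    public static void main(String[] args) {
        for (int i = 0; i < LABELS.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%s: %.1fx",
                    LABELS[i], ratio(WITHOUT_PATCH[i], WITH_PATCH[i])));
        }
    }
}
```

For this one run the ratios come out at roughly 4.5x at the 75th percentile, 3.9x at the 95th and 2.6x at the 99th.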


Description

Subtask with fix for branch-1.0.

Attachments


- 15031.v8.master.patch, 03/Jan/16 00:00, 82 kB, Michael Stack
- 15031.v8.branch-1.0.patch, 28/Dec/15 20:43, 94 kB, Michael Stack
- 15031.v7.branch-1.0.patch, 28/Dec/15 19:34, 94 kB, Michael Stack
- 15031.v6.branch-1.0.patch, 26/Dec/15 16:10, 93 kB, Michael Stack
- 15031.v6.branch-1.0.patch, 25/Dec/15 01:25, 93 kB, Michael Stack
- 15031.v6.branch-1.0.patch, 24/Dec/15 20:06, 93 kB, Michael Stack
- 15031.v5.branch-1.0.patch, 24/Dec/15 05:08, 93 kB, Michael Stack
- 15031.v4.branch-1.0.patch, 23/Dec/15 07:18, 86 kB, Michael Stack
- 15031.v3.branch-1.0.patch, 22/Dec/15 23:22, 85 kB, Michael Stack
- 15031.v2.branch-1.0.patch, 22/Dec/15 21:21, 66 kB, Michael Stack
- 14460.v0.branch-1.0.patch, 22/Dec/15 18:56, 55 kB, Michael Stack

Issue Links

is related to

HBASE-15082 Fix merge of MVCC and SequenceID performance regression

Resolved

links to

Review Board Post
