HBase
  HBASE-2248

Provide new non-copy mechanism to assure atomic reads in get and scan

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.3
    • Fix Version/s: 0.20.4
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      This patch changes the Get code path to instead be a Scan of one row. This means that inserting cells out of timestamp order should work now (tests to verify will follow as part of HBASE-2294), but also that a delete at an explicit timestamp now overshadows the affected cell EVEN if that cell is put after the delete (the old Get code path did an early-out, so subsequent puts would not see the delete).

      Description

      HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when starting a scan.

      After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans. Some of our data represents a time series. The data is stored in time series order, MR jobs often insert/update new data at the end of the series, and queries usually have to pick up some or all of the series. These are often scans of 0-100 rows at a time. To load one page, we'll observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete. Doing a thread dump of a region server shows many threads in ConcurrentSkipListMap.buildFromSorted, which traverses the entire map of key values to copy it.
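
      For reference, a minimal standalone sketch (not HBase code; the map contents and sizes are illustrative) of why the per-scan copy is expensive: the ConcurrentSkipListMap copy constructor, like clone(), goes through buildFromSorted and rebuilds a node for every entry, so the cost of opening a scanner grows linearly with the number of KeyValues in the memstore.

      {code:java}
      import java.util.concurrent.ConcurrentSkipListMap;

      public class MemstoreCloneCost {
        public static void main(String[] args) {
          ConcurrentSkipListMap<Long, byte[]> kvset = new ConcurrentSkipListMap<Long, byte[]>();
          for (long i = 0; i < 1000000; i++) {
            kvset.put(i, new byte[0]); // stand-in for ~1M KeyValues in a loaded memstore
          }
          long start = System.nanoTime();
          // The copy constructor calls buildFromSorted internally: the values are shared,
          // but a new node is built for every entry, so the walk is O(n).
          ConcurrentSkipListMap<Long, byte[]> copy = new ConcurrentSkipListMap<Long, byte[]>(kvset);
          long elapsedMs = (System.nanoTime() - start) / 1000000;
          System.out.println("copied " + copy.size() + " entries in " + elapsedMs + " ms");
        }
      }
      {code}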

      1. ASF.LICENSE.NOT.GRANTED--HBASE-2248-no-row-locks.txt
        4 kB
        ryan rawson
      2. ASF.LICENSE.NOT.GRANTED--HBASE-2248-rr-final1.txt
        106 kB
        ryan rawson
      3. ASF.LICENSE.NOT.GRANTED--HBASE-2248-rr-pre-durability4.txt
        126 kB
        ryan rawson
      4. ASF.LICENSE.NOT.GRANTED--put_call_graph.png
        127 kB
        Andrew Purtell
      5. ASF.LICENSE.NOT.GRANTED--profile.png
        183 kB
        Andrew Purtell
      6. HBASE-2248-GetsAsScans3.patch
        190 kB
        stack
      7. hbase-2248.txt
        5 kB
        Todd Lipcon
      8. readownwrites-lost.2.patch
        3 kB
        Todd Lipcon
      9. readownwrites-lost.patch
        3 kB
        Todd Lipcon
      10. HBASE-2248.patch
        11 kB
        stack
      11. HBASE-2248-demonstrate-previous-impl-bugs.patch
        14 kB
        stack
      12. Screen shot 2010-02-23 at 10.33.38 AM.png
        70 kB
        stack
      13. hbase-2248.gc
        67 kB
        Dave Latham
      14. threads.txt
        12 kB
        Dave Latham

        Issue Links

          Activity

          Dave Latham added a comment -

          Here's some example threads from a dump.

          Dave Latham added a comment -

          After doing a flush on the table, the scans are about 100x faster.

          Dan Washusen added a comment -

           I notice the performance evaluation flushes the table after each test completes; as a result, none of the read tests take the memstore into account. Maybe the PerformanceEvaluation class could be changed to make the flush optional?

          Dan Washusen added a comment -

          HBASE-2249 will allow changes to the MemStore to be performance tested.

          Dan Washusen added a comment -

          K, with PerformanceEvaluation updates running "hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=1000 scanRange100 10" each scan takes on average 9ms to return a max of 100 rows (random data means they don't usually return 100 rows, average seemed to be around 70 rows).

           The setup for that test is as follows:
          1 master
          4 region servers (12GB heap)

          1 million rows set up using:
          hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1

          There were four regions all on one host. Each region had roughly 40MB in the MemStore...

          Jean-Daniel Cryans added a comment -

          1 million rows set up using:

          With randomWrite you don't write 1M rows (more like ~700,000 IIRC) so that explains why your scans aren't always of 100 rows.

          Dan Washusen added a comment -

          @JD: that would explain it...

          With --nomapred (10 client threads in a single VM) each scan took 120-140ms...

           Also, in the randomSeekScan test each scan seems VERY slow. Each scan takes about 15 seconds...? The scanRange100 uses a startRow and stopRow to get 100 rows back (well, 70 rows). The randomSeekScan uses a "scan.setFilter(new WhileMatchFilter(new PageFilter(120)));". What's up with that?

          Oh, also, those tests are on the latest 0.20 branch (not on the 0.20.3 release)...

          Dan Washusen added a comment -

          Cloning the MemStore based on the scan.startRow and scan.stopRow drops the scan times from ~9ms per scan to ~3ms per scan on the above hardware...
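
           Presumably the experiment had roughly this shape (a sketch only, not the tested change; the byte[]-keyed map and the cloneRange helper are stand-ins): restrict the copy to the slice of the kvset that overlaps the scan range via ConcurrentSkipListMap#subMap, so only that slice gets rebuilt.

           {code:java}
           import java.util.concurrent.ConcurrentNavigableMap;
           import java.util.concurrent.ConcurrentSkipListMap;

           public class RangeRestrictedClone {
             // Copies only [startRow, stopRow) instead of the whole kvset. The walk is still
             // linear, but bounded by the scan range rather than the memstore size.
             // Assumes kvset was built with a byte[] comparator.
             static ConcurrentSkipListMap<byte[], byte[]> cloneRange(
                 ConcurrentSkipListMap<byte[], byte[]> kvset, byte[] startRow, byte[] stopRow) {
               ConcurrentNavigableMap<byte[], byte[]> slice = kvset.subMap(startRow, true, stopRow, false);
               ConcurrentSkipListMap<byte[], byte[]> copy =
                   new ConcurrentSkipListMap<byte[], byte[]>(kvset.comparator());
               copy.putAll(slice);
               return copy;
             }
           }
           {code}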

          Todd Lipcon added a comment -

          Can anyone shed light on why HBASE-2037 introduced this clone in the first place? Seems like a totally braindead thing for performance.

          ryan rawson added a comment -

          I had a look at the implementation of clone, and it is really not appropriate for what we are doing.

           I would like to open up discussions to revert the original patch. I would argue there have been too many lurking issues, and the additional functionality, while useful, doesn't justify crippling performance.

          Dan Washusen added a comment -

          @Todd: I didn't author the change but it relates to the tests added with the change.

           @Ryan: The tests added to PE as a result of HBASE-2249 seem to indicate that even with a fully loaded MemStore it takes 9ms to complete a scan for ~100 rows with 10 concurrent client VMs hitting a single region server. That seems to contradict the 1-2 seconds seen by Dave. The thread dump does seem to indicate the clone, but maybe something else is coming into play as well? Maybe the additional 4KB memory allocation is bringing GC into it?

          ryan rawson added a comment -

          could you please tell me where your 4k of memory quote is coming from?

           The clone() is a deep/shallow clone. The KeyValues aren't being cloned, but in every other way the clone is a deep clone - it copies all the nodes! That could be literally a million nodes! The number of nodes depends on your data size... a 64MB memstore can accommodate 1.3m values if your KeyValue size is ~50 bytes. Or even larger: if you start kicking in the memstore multiplier during a pending snapshot, you could have 4m+ nodes in a snapshot and an oversized kvset. Clone is not really viable; it needs to be rolled back. Furthermore it doesn't provide atomic protection anyway.

          Dan Washusen added a comment -

          Very good point...

           Even if the clone took the scan start and stop rows into account, there is still the possibility that only one or neither of them has been provided...

          Yoram Kulbak added a comment -

          Ryan:
          The 4K quote is my mistake, based on a non-typical HBASE usage (small memstore, large KVs).
           Cloning is definitely bad. Its only benefit is that it allows the scan to be isolated from on-going writes; HRegion#newScannerLock takes care of writes not coming in while the scanner is created, so 0.20.3, unlike 0.20.2, does provide protection from 'partial puts' if this was what you're implying by 'atomic protection'. There is also a test added to TestHRegion which verifies that.

          I'm not sure that rollback is a viable option:
           The 0.20.2 Memstore was using the ConcurrentSkipListMap#tailMap for every row. tailMap incurs an O(log n) overhead when called on a ConcurrentSkipListMap, so the total overhead of scanning the whole memstore may, in some cases, be very close to the overhead of a complete sort of the KVs in memstore.
          The 0.20.2 MemStore and MemStoreScanner are also functionally incorrect since

          • The scanner may observe a 'partial put' (not atomically protected)
          • The scanner scans incorrectly when a snapshot exists

           Since we observed a considerable 'single scan' performance improvement using the new MemStore implementation, could the performance hit stem from increased GC overhead on multiple concurrent scans?
           Note that with 0.20.2 we observed that MemStoreScanner is running slower than StoreFileScanner.

          Is it possible to avoid both 'partial puts' and cloning by 'timestamping' memstore records? e.g. each new KV in memstore gets a 'memstore timestamp' and when a scanner is created it grabs the current timestamp so that it knows to ignore KVs which entered the store after its creation? Should probably use a counter and not currentTimeMillis to ensure a clear-cut.

          ------------
          About those ~50 byte KVs, according to my calcs:
          KeyLength: 4 bytes
          ValueLength: 4 bytes
          rowLength: 2 bytes
          FamilyLength: 1 byte
          TimeStamp: 8 bytes
          Type: 1 byte

          There are 20 bytes of overhead to start with.
          Adding an average of 10 bytes for the column and qualifier brings it to 40 bytes.
          This leaves 10 bytes (out of 50) for the row + value. Meaning 80% of the storage is overhead.
          My point is that if ~50b KVs are the common use-case some optimization needs to be made to the way things are stored.
          Perhaps you meant 50b for row+value?
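
           As a quick cross-check of that arithmetic, a small illustrative helper (not HBase code; it only counts the serialized layout listed above and ignores Java object and skip-list node overhead):

           {code:java}
           public class KeyValueSizeEstimate {
             // Serialized layout: keylen(4) + vallen(4) + rowlen(2) + row + famlen(1)
             // + family + qualifier + timestamp(8) + type(1) + value
             static int serializedSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
               int fixedOverhead = 4 + 4 + 2 + 1 + 8 + 1; // the 20 bytes listed above
               return fixedOverhead + rowLen + familyLen + qualifierLen + valueLen;
             }

             public static void main(String[] args) {
               // e.g. a tall-narrow row: 12-byte row, 1-byte family, 1-byte qualifier, 3-byte value
               System.out.println(serializedSize(12, 1, 1, 3)); // prints 37; 20 of it is fixed overhead
             }
           }
           {code}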

          Dave Latham added a comment -

          Thanks, Dan, and others for looking into this issue. The table where we were seeing these slow scans was definitely a tall, narrow table. Each row has one cell, the column family and qualifier are each one byte. The row varies, but is typically 8-20 bytes, and the value is usually 4 bytes or less. Most common is probably row - 12 bytes, col fam - 1 byte, qualifier 1 byte, value - 3 bytes, giving 17 bytes plus overhead.

           As I was trying to understand the discrepancy between the PE results you mentioned and what I've observed, I looked into PerformanceEvaluation. It looks like the timer only starts after the scanner is constructed, which means that the MemStore clone isn't being timed as part of the test, so that would probably explain why the test seems fast. Just reasoning, it seems hard to believe that ConcurrentSkipListMap.buildFromSorted could complete a million iterations that fast.

          stack added a comment -

           bq. Can anyone shed light on why HBASE-2037 introduced this clone in the first place? Seems like a totally braindead thing for performance.

          Mea Culpa. I should have caught this in review, the non-scalable, expensive full-copy. Dumb.

           I also should have run PE to catch degradation in performance before release, though in this case, according to Dan, as PE is now, we'd not have caught the slowed-down memstore since we flush after each PE run and since the short-scan test is new with no history (Long time ago I wrote up a how-to-release: http://wiki.apache.org/hadoop/Hbase/HowToRelease. It says PE is required but I think I've not followed this recipe in a good while now).

           bq. The 0.20.2 Memstore was using the ConcurrentSkipListMap#tailMap for every row. tailMap incurs an O(log n) overhead when called on a ConcurrentSkipListMap so the total overhead of scanning the whole memstore in some cases, may be very close to the overhead of a complete sort of the KVs in memstore.

           In the old implementation, we used to also make a copy of a row, every time we called next, to protect against the case where the snapshot was removed out from under us.

           bq. The scanner scans incorrectly when a snapshot exists

          Why was this again?

           bq. ... increased GC overhead on multiple concurrent scans

          Dave, can you enable GC logging? Even if this is the case, it needs to be addressed.

           bq. Is it possible to avoid both 'partial puts' and cloning by 'timestamping' memstore records? e.g. each new KV in memstore gets a 'memstore timestamp' and when a scanner is created it grabs the current timestamp so that it knows to ignore KVs which entered the store after its creation? Should probably use a counter and not currentTimeMillis to ensure a clear-cut.

          How would we snapshot such a thing?

           We could add another ts/counter to KV. We could do an AND on the type setting a bit if extra ts is present. We then write out the KV as old style, dropping extra ts when we flush to hfile, or we just dump it all out. System would need to be able to work with old-style KVs. Comparator would be adjusted to accommodate new KV. We'd do a tailset each time we made a scanner? This would be a big change. We should probably bump rpc version and require a restart of hbase cluster on upgrade.

          Dave Latham added a comment -

          I've got gc logging enabled. Here's a snapshot of the regionserver for a few minutes during which I ran this test 5 or 6 times and generated 360 short scans. Let me know if there's any other GC info that would be useful.

          stack added a comment -

          There is a bunch of YG GC'ing going on... Might slow things some but not by much. I've attached a screen shot.

          Dan Washusen added a comment -

           @Dave: Could you have a look at HBASE-2249 and confirm that the call to HTable.getScanner(...) is now being timed?

          Dave Latham added a comment -

          @Dan: Took a read over the patch, though it seemed to be based in a different dir and didn't want to apply nicely. From what I can see the ScanTest still does getScanner in testSetup before the timer is begun. This may be fine, if the point of this test is to measure scan performance per-row and not setup/teardown time. It just explains why the ScanTest doesn't exhibit this issue. It does look like other tests, such as the RandomSeekScanTest and the new RandomScanWithRangeTest do test setup/teardown time as part of each "testRow" and should exhibit this issue, if run.

          ryan rawson added a comment -

           Done properly, a timestamp-oriented fix to version the memstore should not require any RPC version bump; it's all internal.

          Dan Washusen added a comment -

          @Dave:

          Correct you are. I've added comments on HBASE-2249 as a result of your comments here...

           It's worth noting that in the case of ScanTest the cost of setting up the ResultScanner is almost non-existent compared to the cost of scanning over the majority of the table. The ScanTest takes 23 seconds in total according to the log output (including opening the scanner etc).

          Dave, the numbers I posted above (9ms) were from the RandomScanWithRangeTest. As you mention, these tests include the cost of opening the scanner. I was under the impression that this was closer to your use case (e.g. specify both a scan.startRow and scan.stopRow which returns a small number of rows)...?

          Yoram Kulbak added a comment -

          I did the following sanity check: I rolled back memstore to just before HBASE-2037 was applied [last commit on 21 Oct 2009].
          [ To get things going I had to put back the MemStore#numKeyValues method and change the MemStore#clearSnapshot argument to SortedSet ]

          I then ran TestHRegion and two tests failed:

          • testFlushCacheWhileScanning - demonstrates the incorrect scans while a snapshot exists issue
          • testWritesWhileScanning - demonstrates 'partial puts' being visible to the scanner
            I also tried running TestMemStore but all the tests there have passed. I didn't try running the whole suite.

           It took me a while to figure out what exactly goes wrong when a snapshot exists. The short (and vague) explanation is that the scanner may return keys in a non-ordered manner, meaning a KV with a higher row may be returned before a KV with a lower row, because the result list which aggregates results from both snapshot and kvset doesn't guarantee the KVs are added in sorted order. I think there's a way to add a simple test to TestMemStore which will demonstrate that.

          stack added a comment -

           Patch that restores memstore to how it was. With this in place, run the memstore unit tests to see how the old implementation was broken.

          ryan rawson added a comment -

           I have a prototype implementation of how to fix the atomic read without using locking or copying. I'll put up a patch within a few days. It's a little subtle, but put simply it uses sequential "timestamps" to internally version the memstore so readers know to ignore half-written rows.

          stack added a comment -

          Here is an attempt. Tests pass. Posting for review. Need to do load tests yet.

          "- Added a (transient) int updateId to KeyValue

          • Memstore populates it on Adds and Deletes
           • When a MemstoreScanner is created it grabs the current id (actually increments it to make sure no KV has that same id) and ignores records from kvset having an id greater than the one grabbed. Snapshots are scanned in full since they're not updated during the scanner's lifetime, hence there's no risk of partial updates being visible. There may be an issue with deletes becoming partly visible in this scheme, I'll check that later."
          Todd Lipcon added a comment -

           There may be an issue with deletes becoming partly visible in this scheme

          I would think so - deletes in the memstore don't use tombstones, do they? Similarly for updates - if you update a row, its internal ts will update and the scanner will no longer see the old version either.

          ryan rawson added a comment -

          deletes use tombstones, but the current GET code might need... adjustment to make it work. I'm working on a base fix which I will post soon and I'll also check the get implementation.

          stack added a comment -

          Yeah, if client adds new edit w/ exact same ts and the comparator used by memstore does not take sequenceid into consideration, we'll have issues Todd identifies. Perhaps change the Comparator used by MemStore to consider sequenceid? Also missing from patch is enforcement of the fact that on flush, the flush file has deletes that apply to older files only – not to current flush file content.

          ryan rawson added a comment -

          I'm working on this, there is a general approach hammered out and code to be written.

          The approach is like so:

          • on read from memstore, for each row, we grab the 'read number' and ignore any keyvalues in the structure newer (ie: > value)
           • on put to hregion/memstore, we start a write 'tx' and get a write-number, and put keyvalues with said write-number. when we are finished, that write-number is 'committed' which causes the read number to be advanced most of the time. under concurrent writes we have a little queue and slower puts may slightly hold up puts that come before it.

          this will need to be extensively tested to see how the performance profile changes. it will allow us to remove the newScannerLock.
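
           A minimal sketch of the read-number/write-number scheme described in the list above (class and method names here are illustrative, not the actual patch): writers take monotonically increasing numbers, completion advances a read point in queue order, and a scanner skips any KeyValue tagged with a number above the read point it captured at creation.

           {code:java}
           import java.util.LinkedList;
           import java.util.concurrent.atomic.AtomicLong;

           public class MemstoreReadWriteNumbers {
             private final AtomicLong memstoreWrite = new AtomicLong(0); // last write number handed out
             private volatile long memstoreRead = 0;                     // highest fully committed write
             private final LinkedList<WriteEntry> writeQueue = new LinkedList<WriteEntry>();

             public static class WriteEntry {
               final long writeNumber;
               boolean completed = false;
               WriteEntry(long n) { this.writeNumber = n; }
             }

             /** Start of a put/delete: every KeyValue of this update is tagged with this number. */
             public synchronized WriteEntry beginMemstoreInsert() {
               WriteEntry e = new WriteEntry(memstoreWrite.incrementAndGet());
               writeQueue.add(e);
               return e;
             }

             /** Once the KeyValues are in the kvset, advance the read point past every leading,
              *  completed entry. A slow earlier writer holds back the read point for later ones. */
             public synchronized void completeMemstoreInsert(WriteEntry e) {
               e.completed = true;
               while (!writeQueue.isEmpty() && writeQueue.peek().completed) {
                 memstoreRead = writeQueue.poll().writeNumber;
               }
             }

             /** Captured once per scanner; the scanner ignores KVs with a larger write number. */
             public long readPoint() {
               return memstoreRead;
             }
           }
           {code}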

          ryan rawson added a comment -

          Ok here is my proposal to fix this, hopefully once and for all.

          The only thing that isn't covered is deletes:

           • removing keyvalues won't ever be atomic
          • we could stop deleting key values, but the get code would have to be checked
            • the flush would also need to prune out deleted key values to keep the delete invariant of 'get' going on.
          ryan rawson added a comment -

          my patch passes all the new tests added by HBASE-2037 which focus on parallelism while doing scans.

          Jonathan Gray added a comment -

          Might be time to turn gets into scans so we don't have a second read code path.

          Yoram Kulbak added a comment -

           Turning gets into scans will cause some minor functional changes. See for example the differences between gets and scans exposed in TestClient#testDeletes. IMHO eliminating the functional differences between gets and scans will be a change for the better, but perhaps there are existing users which rely on those subtle differences.

          Todd Lipcon added a comment -

           IMHO eliminating the functional differences between gets and scans will be a change for the better but perhaps there are existing users which rely on those subtle differences

           +1 for eliminating the differences. If people are relying on broken behavior, they should fix their applications. HBase is not 1.0; let's pick sanity over compatibility.

          Todd Lipcon added a comment -

          Hey Ryan

          I looked over this patch a bit this afternoon. It's clever but I think it can result in loss of read-your-own-writes consistency for a single client. Consider this scenario:

           Action                            | Read # | Write # | memstoreRead | memstoreWrite
           Client A begins a put on row R    |        | 1       | 0            | 1
           Client B begins a put on row S    |        | 2       | 0            | 2
           Client B finishes a put on row S  |        |         | 0            | 2
           Client B initiates a get on row S | 0      |         | 0            | 2

          So, since client A's put #1 is still ongoing on a separate row, client B is unable to read version #2 of its row.

          I think dropping consistency below read-your-own-writes is bad, even though it's rare that the above situation would occur. Under high throughput I think it's possible to occur, and it could be very very bad if people are relying on this level of consistency to implement transactions, etc.

          One possible solution is that completeMemstoreInsert can spin until memstoreRead >= e.getWriteNumber(). Given that it only has to wait for other concurrent writers to finish, a spin on memstoreRead.get() should only go a few cycles and actually be reasonably efficient.

          I'll think a bit about whether there are other possible solutions.
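
           A sketch of that suggestion, building on the read/write-number sketch a few comments up (still illustrative, not the committed patch): completeMemstoreInsert does not return until the read point has caught up to the caller's own write number, so a client whose put has been acknowledged can always read its own write.

           {code:java}
           // Variant of completeMemstoreInsert from the earlier sketch, with the spin-wait added.
           public void completeMemstoreInsert(WriteEntry e) {
             synchronized (this) {
               e.completed = true;
               while (!writeQueue.isEmpty() && writeQueue.peek().completed) {
                 memstoreRead = writeQueue.poll().writeNumber;
               }
             }
             // Spin until every earlier in-flight writer has finished and the read point
             // passes this write; it only waits on concurrent writers, typically a few cycles.
             while (memstoreRead < e.writeNumber) {
               Thread.yield();
             }
           }
           {code}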

          Todd Lipcon added a comment -

          Here's a test case patch (on top of yours) which should illustrate the issue. It fails every time for me on a dual core box:

          Didnt read own writes expected:<395> but was:<394>
          junit.framework.AssertionFailedError: Didnt read own writes expected:<395> but was:<394>
          at org.apache.hadoop.hbase.regionserver.TestMemStore$ReadOwnWritesTester.internalRun(TestMemStore.java:293)
          at org.apache.hadoop.hbase.regionserver.TestMemStore$ReadOwnWritesTester.run(TestMemStore.java:268)

          Todd Lipcon added a comment -

          Here's a slightly better test patch, much more sure to fail.

          (this test could easily be written without multiple threads, but as an illustration of the client's view of the consistency, the threads are useful)

          ryan rawson added a comment -

           I think your suggestion is a good one; the race condition is really small, and holding up a client for just a few more microseconds should be reasonable. Once we restructure to not put log appends between memstore puts, we are literally talking about the speed of adding a few dozen entries in an array. There is no data copy involved, since the KeyValue was already read in during RPC time, and we are talking about inserting small objects into a data structure.

          I originally thought of being speedy about returning, but read your own writes does make this be an issue. I'll add in your suggestions and put this test in as well.

          Thanks for the great test!

          Todd Lipcon added a comment -

          Here's a patch on top of Ryan's which implements the spin-wait. The concurrency test for read-own-writes now passes.

          stack added a comment -

           @Ryan, your next patch picks up Todd's spin-wait I believe?

          stack added a comment -

          Here is a different take for review and input on how to solve this issue.

           Get is now implemented using Scan. I deleted lots of get-related classes/code including the QueryMatcher. Deletes are no longer removing KVs from memstore. The change so that on flush we filter deleted KVs is not done in this patch – can be done in another issue. Maybe we don't want to filter deleted KVs on flush but rather on minor compactions, for instance (The axiom that a file holds only deletes that pertain to values held in storefiles that follow may not be necessary when gets are implemented using scan?).

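           Roughly, the new read path has this shape (a sketch only, not the patch itself; it leans on the Scan-from-Get constructor listed under Scan.java below, and omits locking and error handling):

           {code:java}
           // Inside an HRegion-like context; assumes the usual org.apache.hadoop.hbase imports
           // (Get, Scan, KeyValue, InternalScanner) plus java.io.IOException and java.util.*.
           public List<KeyValue> get(Get get) throws IOException {
             Scan scan = new Scan(get);            // same row, families, columns, time range, filter
             List<KeyValue> results = new ArrayList<KeyValue>();
             InternalScanner scanner = getScanner(scan);
             try {
               scanner.next(results);              // a single next() suffices: start row == stop row
             } finally {
               scanner.close();
             }
             return results;
           }
           {code}
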
          Things left to do:

          • Performance test
          • More accurate heap size calculation for HRegion
          • Discuss where/when deletes should be partially applied

          Here is more detail on what this change includes:

          M src/contrib/indexed/src/java/org/apache/hadoop/hbase/regionserver/IdxRegion.java
          minor tweak due to Memstore#getScanners signature change

          M src/java/org/apache/hadoop/hbase/HConstants.java
          Appended EMPTY_KEY_VALUE_UPDATE_ID to stand for an unset update id

          M src/java/org/apache/hadoop/hbase/KeyValue.java
          Added a transient int updateId + accessors + heap size adjustment
          Added a createLastOnRow method (similar to create first on row) and
          made sure the comparator treats this case symmetrically

          M src/java/org/apache/hadoop/hbase/client/Scan.java
          Added a constructor which accepts a Get and creates a matching scan +
          a convenience method isGetScan

          M src/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
          Modified references to QueryMatcher to refer to ScanQueryMatcher

          D src/java/org/apache/hadoop/hbase/regionserver/DeleteCompare.java
          M src/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          QueryMatcher -> ScanQueryMatcher

          M src/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java
          QueryMatcher -> ScanQueryMatcher

          D src/java/org/apache/hadoop/hbase/regionserver/GetDeleteTracker.java
          M src/java/org/apache/hadoop/hbase/regionserver/HRegion.java
           Added a member of type RegionUpdateTracker which is initialized both
           in the constructor and on every flush + heap size adjustment
          Scans are now paused while flushers prepare (e.g. snapshots are taken)
          Old gets replaced with new get implementation (which uses scans)
          #getClosestRowBefore is now using HRegion#get instead of Store#get
           #delete(Delete,Integer,boolean) no longer acquires a newScannerLock
           and also tracks update ids using the update tracker
           #delete(byte[],List,boolean) changed from protected to package-private since it's
           used as an internal HRegion subroutine and accessed a few times by
           tests. It also no longer acquires the update lock
           #put no longer acquires newScannerLock; also modified to track update ids
          RegionScanner stop-row logic was adjusted to support get scans. Also,
           RegionUpdateTracker#UpdateIdValidator is now acquired and passed down
          to store scanners

          M src/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java
          no longer Cloneable

          M src/java/org/apache/hadoop/hbase/regionserver/MemStore.java
          deleted lots of unneeded logic, mainly around deletes (very much
           simplified) and gets (no longer needed)

          M src/java/org/apache/hadoop/hbase/regionserver/MemStoreScanner.java
           Modified to consider UpdateIdValidator for kvset KeyValues. Snapshot
           KVs are reset to an undefined update id for Store#updateColumnValue to
           remain backward compatible

          D src/java/org/apache/hadoop/hbase/regionserver/QueryMatcher.java
          A src/java/org/apache/hadoop/hbase/regionserver/RegionUpdateTracker.java
           Tracks updates to HRegions. See javadoc.

          M src/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java
          Fixed to throw an exception as comment suggests

          M src/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          Merged with the deleted QueryMatcher. Added a slight variant for get
          scans to use 'lastInRows'

          M src/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
          QueryMatcher -> ScanQueryMatcher

          M src/java/org/apache/hadoop/hbase/regionserver/Store.java
          #getScanner now accepts an UpdateIdValidator

          #get deleted
          #updateColumnValue modified to use scans and not memstore#getWithCode
          (which was deleted)
          D src/java/org/apache/hadoop/hbase/regionserver/StoreFileGetScan.java
          M src/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          QueryMatcher.MatchCode -> ScanQueryMatcher.MatchCode
          passing around of the UpdateIdValidator

          D src/java/org/apache/hadoop/hbase/regionserver/WildcardColumnTracker.java
          M src/test/org/apache/hadoop/hbase/TestKeyValue.java
          M src/test/org/apache/hadoop/hbase/client/TestClient.java
          M src/test/org/apache/hadoop/hbase/io/TestHeapSize.java
          D src/test/org/apache/hadoop/hbase/regionserver/TestDeleteCompare.java
          M src/test/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          D src/test/org/apache/hadoop/hbase/regionserver/TestGetDeleteTracker.java
          M src/test/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          M src/test/org/apache/hadoop/hbase/regionserver/TestMemStore.java
          D src/test/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          A src/test/org/apache/hadoop/hbase/regionserver/TestRegionUpdateTracker.java
          M src/test/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
          M src/test/org/apache/hadoop/hbase/regionserver/TestStore.java
          D src/test/org/apache/hadoop/hbase/regionserver/TestWildcardColumnTracker.java

          ryan rawson added a comment -

          Yes I have asked Todd and rolled up his patch. I have identified a small
          race condition in scanning today and I'll fix it soon and likely post on
          Monday.

          On Mar 12, 2010 12:25 AM, "stack (JIRA)" <jira@apache.org> wrote:

          [
          https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844367#action_12844367]

          stack commented on HBASE-2248:
          ------------------------------

          @Ryan, your next patch picks up Todd's spin-wait I believe?

          Todd Lipcon added a comment -

          I'm upgrading this to blocker - this patch fixes a ton of deadlock scenarios (see HBASE-2322)

          Todd Lipcon added a comment -

          Has this JIRA outgrown its scope? Should the reasonably small fix that Ryan and I did go in first, and then we can do the get->scan conversion separately?

          ryan rawson added a comment -

          Here is my patch to address the issue.
          Some notes:

          • we use a single forward pushing read point to accomplish atomicity. Right now with appends and memstore puts mixed in, this causes writes to appear to be serialized. There is no sync block which prevents multiple puts from being done at the same time though. When we restructure WAL append to happen in one go, and memstore put to happen after, this issue will go away.
          • this patch does atomicity at a multi-family level. This means that if a write is going across multiple families, and we do a concurrent scan (AND if a concurrent flush also happens) we will only read the previously completely written row. No partial multi-family row reads.
          • Deletes are also atomic. By no longer removing KeyValues from memstore we make it so. Also adjusted Gets to use 1 row Scans, and put in a TODO for a bloomfilter hook.
          • During a scan, we will ride over a snapshot and see the new data. We will also reset the read point between every row, so a scan will see new values as they are inserted after its current read point. If a row is updated after a scan has already seen it, it will of course not see that value. (A rough sketch of the read-point mechanics follows this list.)

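          A minimal sketch of the forward-pushing read point described above, assuming a simplified model rather than the patch's actual ReadWriteConsistencyControl (the names, the queue, and the busy-wait at the end are illustrative only):

          import java.util.LinkedList;
          import java.util.concurrent.atomic.AtomicLong;

          class ReadPointSketch {
            static class WriteEntry {
              final long writeNumber;
              volatile boolean completed = false;
              WriteEntry(long n) { this.writeNumber = n; }
            }

            private final AtomicLong memstoreRead = new AtomicLong(0);   // highest write visible to new scanners
            private long memstoreWrite = 0;                              // last assigned write number
            private final LinkedList<WriteEntry> writeQueue = new LinkedList<WriteEntry>();

            WriteEntry beginMemstoreInsert() {
              synchronized (writeQueue) {
                WriteEntry e = new WriteEntry(++memstoreWrite);
                writeQueue.add(e);
                return e;
              }
            }

            void completeMemstoreInsert(WriteEntry e) {
              synchronized (writeQueue) {
                e.completed = true;
                // Push the read point forward over the completed prefix of the queue.
                while (!writeQueue.isEmpty() && writeQueue.getFirst().completed) {
                  memstoreRead.set(writeQueue.removeFirst().writeNumber);
                }
              }
              // Spin until this write is visible to new readers (read-your-own-writes).
              while (memstoreRead.get() < e.writeNumber) {
                Thread.yield();
              }
            }

            long readPoint() {   // scanners ignore KeyValues tagged with a higher write number
              return memstoreRead.get();
            }
          }
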
          Some thanks:

          • Todd for pointing out a case where read-your-own-writes might not happen
          • Hints from the HBASE-2248-GetsAsScans patch for doing single row scans.
          Todd Lipcon added a comment -

          Hey Ryan.

          Will try to take a look at your patch early this week. Regarding the new atomicity properties, can you please comment in HBASE-2294? I want to make sure that, if we add these properties, (a) they are really user requirements and (b) we have documented them somewhere. If (a) isn't true, we should document that, in case we find efficiency improvements we can make by dropping some of them.

          ryan rawson added a comment -

          My patch doesn't add any new properties; it just makes the existing ones efficient and removes locks. I already enumerated a number of properties we have and would like to have in HBASE-2294.

          Andrew Purtell added a comment -

          Some feedback based on initial tests:

          • Heap sizes need to be adjusted for KeyValue (add a long), HRegion (add a reference), and MemStore (add a reference).
          • Indexed contrib is unhappy:
            • o.a.h.h.TestIdxHBaseCluster fails.
            • o.a.h.h.regionserver.TestHRegionWithIdxRegion OOMEs after 376 put iterations.

          The OOME is concerning, as it might be catching a memory leak introduced in the change set.

          I'm going to put it up on EC2 and throw PE at it tomorrow.

          Yoram Kulbak added a comment -

          bq. I'm going to put it up on EC2 and throw PE at it tomorrow.

          What are you going to compare it against?
          May I suggest comparing against both the branch without the patch and the branch with GetsAsScans3 applied?

          stack added a comment -

          Yeah, I just tried to run test suite and ran into at least the TestHeapSize failures.

          On a test up on cluster, something is up. It's not deadlocked but it's only making slow progress. Will look in the morning.

          On the patch:

          + "aka DNC" ... whats DNC? (Democratic National Committee?)
          + In KV, it has "+ * @deprecated" Usually deprecated points helpfully to what should be used instead. What should folks use instead of createFirstOnRow override?
          + 1 on this comment of yours " // TODO the family and qualifier should be compared separately"
          + So, on flush of the MemStore, we don't need to clean out items that MemStore Deletes effect? We now let go of the old axiom that Deletes in storefiles only apply to storefiles that follow and not to the current storefile?
          + I love all the stuff removed.

          More review later.

          What do we see as implications of removal of the special Get-code path?

          + Is it true that now, you can do inserts where timestamps are out of order? (If no deletes?) If so, don't we need unit tests to prove this assertion?
          + What about performance? Though the new Get-Scan does storefile accesses in parallel, if > N storefiles, if looking for latest version only, we'll be slower (at least until we add BFs?).

          ryan rawson added a comment -

          There is some profiling to be done to figure out what the problems might be. I think running some Unit tests under a profiler will be illuminating.

          DNC = do not care, it's a hardware engineering thing

          The problem with Get vs Scan is that the old code was not correct, so favoring faster incorrect code is something I thought we agreed we would not do. But yes, we no longer get that assumed performance improvement.

          Todd Lipcon added a comment -

          HBASE-2265 should really help with culling access to older regions (without the more complicated bloom filter solution)

          Andrew Purtell added a comment -

          Commit of HBASE-2283 invalidates the patch on this issue.

          ryan rawson added a comment -

          I'll update my patch to accommodate this commit.

          it doesn't "invalidate" the patch - there is still a window of opportunity to see partial row updates.

          Andrew Purtell added a comment -

          Yeah, ok, imprecise word choice. Thanks for rebasing.

          ryan rawson added a comment -

          I'll try to update this today or this weekend!

          stack added a comment -

          +1 on updating the patch. I just tried to do it and it's a little involved so I left it to the expert.

          I've been doing a rough benchmark of 0.20.2 hbase so I can measure roughly how this patch affects coarse performance (I didn't bother measuring 0.20.3; it must be slower than 0.20.2).

          ryan rawson added a comment -

          Here is my update to my patch, this time I am using iterators to scan the memstore and snapshot. There are a number of fixes to all sorts of fun race conditions, etc.

          The best news: this is the fastest memstore scanner HBase has seen. It is about 15x faster than the 0.20.3 version based on the microbenchmark included in the patch. The old code takes about 400-500ms to scan 250k KeyValues in memstore, and this new patch takes 25-30ms.

          I haven't run all the tests yet, but it passes the core TestMemStore and TestHRegion which contain all the hard tests that have concurrency.
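
          A rough, self-contained illustration of why an iterator-based memstore scan beats the old copy-then-scan approach (this is not the microbenchmark included in the patch; the entry count, value size, and any timings it prints are illustrative assumptions):

          import java.util.Map;
          import java.util.concurrent.ConcurrentSkipListMap;

          public class MemstoreScanBench {
            public static void main(String[] args) {
              ConcurrentSkipListMap<Long, byte[]> kvset = new ConcurrentSkipListMap<Long, byte[]>();
              for (long i = 0; i < 250000; i++) {
                kvset.put(i, new byte[32]);
              }

              // Old approach: copy the map up front (the SortedMap copy constructor walks
              // every entry, like buildFromSorted), then scan the private copy.
              long t0 = System.nanoTime();
              ConcurrentSkipListMap<Long, byte[]> copy = new ConcurrentSkipListMap<Long, byte[]>(kvset);
              long copied = 0;
              for (Map.Entry<Long, byte[]> e : copy.entrySet()) {
                copied++;
              }
              long t1 = System.nanoTime();

              // New approach: walk the live map with its weakly consistent iterator; a read
              // point (not modelled here) decides which entries each scanner may return.
              long iterated = 0;
              for (Map.Entry<Long, byte[]> e : kvset.entrySet()) {
                iterated++;
              }
              long t2 = System.nanoTime();

              System.out.println("copy+scan: " + (t1 - t0) / 1000000 + " ms (" + copied + " kvs), "
                  + "iterator scan: " + (t2 - t1) / 1000000 + " ms (" + iterated + " kvs)");
            }
          }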

          ryan rawson added a comment -

          a version that compiles and passes TestHeapSize

          stack added a comment -

          Can we have a version for the 0.20_pre_durability branch, Ryan? There are a bunch of failures, all in HRegion. Some I can sort of make sense of but others would take me a while to figure out. You know the code, so it would probably take you a short amount of time?

          stack added a comment -

          This is a big patch. Can we have a bit more detail than what is given above on what it does to help w/ review?

          Here's some comments so far:

          In KV, this looks like a fix to the comparator:

          +      if (rcolumnlength == 0 && rtype == Type.Minimum.getCode()) {
          +        return -1;
          +      }
          

          If we had a getScan datamember flag – true if this scan is a Get scan – we could set it if the constructor that takes a Get is invoked and avoid comparing start and end rows. If the flag is not set, go ahead and do the compare.
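
          Something along these lines, as a sketch of the suggestion rather than the committed code (class and method names here are placeholders):

          import java.util.Arrays;
          import java.util.Comparator;

          public class GetScanSketch {
            private final byte[] startRow;
            private final byte[] stopRow;
            private boolean getScan = false;   // set only by the Get-style constructor

            public GetScanSketch(byte[] startRow, byte[] stopRow) {
              this.startRow = startRow;
              this.stopRow = stopRow;
            }

            /** A "get scan": start and stop row are the same single row. */
            public GetScanSketch(byte[] getRow) {
              this(getRow, getRow);
              this.getScan = true;
            }

            public boolean isGetScan() { return getScan; }

            /** Stop-row check a region scanner might use: for a get scan there is only
             *  one row, so no start/stop comparison is needed. */
            boolean isStopRow(byte[] currentRow, Comparator<byte[]> rowComparator) {
              if (getScan) {
                return !Arrays.equals(currentRow, startRow);   // anything past the single row stops us
              }
              return stopRow.length > 0 && rowComparator.compare(currentRow, stopRow) >= 0;
            }
          }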

          Want to remove this?

          +//    if (LOG.isDebugEnabled()) {
          +//      LOG.debug("compareResult=" + compareResult + " " + Bytes.toString(data, offset, length));
          +//    }
          

          This has to be public?

          +  public ReadWriteConsistencyControl getRWCC() {
          

          Is this unused?

          +  @SuppressWarnings({"UnusedDeclaration"})
             public final static byte [] REGIONINFO_FILE_BYTES =
               Bytes.toBytes(REGIONINFO_FILE);
          

          Remove it then?

          Same here:

          +  @SuppressWarnings({"UnusedDeclaration"})
             public long getRegionId() {
          

          There are a bunch of them.

          I'm about 1/5th done. So far patch looks great.

          Andrew Purtell added a comment -

          When I run alpha3 on top of 0.20 head I see delete test failures in:

          Testcase: testWeirdCacheBehaviour took 81.599 sec
          Testcase: testFilterAcrossMutlipleRegions took 47.8 sec
          Testcase: testSuperSimple took 18.15 sec
          Testcase: testFilters took 15.781 sec
          Testcase: testSimpleMissing took 16.448 sec
          Testcase: testSingleRowMultipleFamily took 16.686 sec
          Testcase: testNull took 21.443 sec
          Testcase: testVersions took 18.169 sec
          Testcase: testVersionLimits took 15.861 sec
          Testcase: testDeletes took 15.837 sec
          Caused an ERROR
          null
          java.lang.NullPointerException
          at org.apache.hadoop.hbase.client.TestClient.testDeletes(TestClient.java:1608)

          Testcase: testJIRAs took 46.161 sec

          Need more detail?

          stack added a comment -

          Why do this?

          -      List<KeyValue> results = new ArrayList<KeyValue>();
          -      store.get(get, null, results);
          -      return new Result(results);
          +      get.addFamily(family);
          +      return get(get, null);
          

          Is it because of this....up in HRegion:

          +      // The reason why we set it up high is so that each HRegionScanner only
          +      // has a single read point for all its sub-StoreScanners.
          +      ReadWriteConsistencyControl.resetThreadReadPoint(rwcc);
          

          This bit of the patch looks like its breaking the accumulation of qualifiers:

          -            List<KeyValue> result = new ArrayList<KeyValue>(1);
          -            Get g = new Get(kv.getRow());
          -            g.setMaxVersions(count);
          -            NavigableSet<byte []> qualifiers =
          -              new TreeSet<byte []>(Bytes.BYTES_COMPARATOR);
          -            qualifiers.add(qual);
          -            get(store, g, qualifiers, result);
          +            Get get = new Get(kv.getRow());
          +            get.setMaxVersions(count);
          +            get.addColumn(family, qual);
          +
          +            List<KeyValue> result = get(get);
          +
          

          Or is this no longer needed because we go back into Region.get rather than do a Store.get?

          I don't like this if clause style:

          + if (w != null)
          + rwcc.completeMemstoreInsert(w);

          I'd suggest you either wrap it in params or put it all on the one line.

          For example, undo changes like this I'd say:

          -      if(lockid == null) releaseRowLock(lid);
          +      if(lockid == null)
          +        releaseRowLock(lid);
          
          

          I think this needs a comment:

          + private int isScan;

          or maybe where it's assigned, so it's clear why it can't be a boolean though it's named as though it were one... maybe change its name?

          I don't get this:

          +      // TODO call the proper GET API
                 // Get the old value:
                 Get get = new Get(row);
          

          or this:

          +    //noinspection SuspiciousMethodCalls
          

          Why this change Ryan?

          -class KeyValueSkipListSet implements NavigableSet<KeyValue>, Cloneable {
          +class KeyValueSkipListSet implements NavigableSet<KeyValue> {
          

          It's crazy how much code you've removed from MemStore around #195.

          OK, that's enough for now.

          ryan rawson added a comment -

          This patch fixes the testDelete failure that people have been seeing; it should apply to both the 0.20 and 0.20_pre_durability branches.

          stack added a comment -

          Running unit tests, at least this one is failing for me:

          [junit] Test org.apache.hadoop.hbase.TestRegionRebalancing FAILED (timeout)

          stack added a comment -

          I'm trying to run on cluster but it's all hanging on me. It's probably a config mess-up on my part. Trying to figure it out.

          Meantime, this seems to run about 3-4 times slower than release against a standalone hbase:

          $ ./bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1

          Andrew Purtell added a comment -

          0.20_pre_durability branch plus HBASE-2248-rr-pre-durability2.txt passes rebalancing for me but fails TestIdxHBaseCluster consistently.

          Testcase: testConcurrentReadWrite took 93.349 sec
          	FAILED
          nextCount=0, count=2, finalCount=2000
          junit.framework.AssertionFailedError: nextCount=0, count=2, finalCount=2000
          	at org.apache.hadoop.hbase.TestIdxHBaseCluster.testConcurrentReadWrite(TestIdxHBaseCluster.java:123)
          
          Testcase: testHBaseCluster took 41.074 sec
          

          Before the patch the indexed contrib tests pass on 0.20_pre_durability.

          stack added a comment -

          In the above I'm testing the branch-of-branch. All core tests but the above-noted rebalancing test passed.

          ryan rawson added a comment -

          Unfortunately, the way the branch and the branch-of-branch do things has diverged a lot, specifically the locations of the update lock and the flush request. I rearranged things a bunch and have this new patch. This might help with the PE slowness and other things as well.

          Andrew Purtell added a comment -

          On head of 0.20_pre_durability and patch rr-pre_durability3.txt I see this:

          Testcase: testWritesWhileScanning took 0.155 sec
                    FAILED
          i=36 expected<1000> but was: <0>
          junit.framework.AssertionFailedError: i=36 expected:<1000> but was: <0>
                    at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileScanning(TestHRegion.java:2014)
          
          stack added a comment -

          I'm about 50% through HBASE-2248-rr-alpha3.txt:

          ReadWriteConsistencyControl is missing a license (I like the name of this class and it's nice and clean looking).

          This could be final:

          + private long writeNumber;

          This class doesn't have to be public:

          + public static class WriteEntry {

          Can you explain the below better please?

          -    // The Get returns the latest value but then does not return the
          -    // oldest, which was never deleted, ts[1]. 
          -    
          +
          +    // It used to be due to the internal implementation of Get, that
          +    // the Get() call would return ts[4] UNLIKE the Scan below. With
          +    // the switch to using Scan for Get this is no longer the case.
               get = new Get(ROW);
               get.addFamily(FAMILIES[0]);
               get.setMaxVersions(Integer.MAX_VALUE);
               result = ht.get(get);
               assertNResult(result, ROW, FAMILIES[0], QUALIFIER, 
          -        new long [] {ts[2], ts[3], ts[4]},
          -        new byte[][] {VALUES[2], VALUES[3], VALUES[4]},
          +        new long [] {ts[1], ts[2], ts[3]},
          +        new byte[][] {VALUES[1], VALUES[2], VALUES[3]},
                   0, 2);
          

          Reading it, it would seem that we should be getting ts[4] because we just added it previously?

          Why do this?

          -    Scan scan = new Scan();
          -    scan.setFilter(new RowFilter(CompareFilter.CompareOp.EQUAL,
          -      new BinaryComparator(Bytes.toBytes("row0"))));
          +    Scan scan = new Scan(Bytes.toBytes("row0"), Bytes.toBytes("row1"));
          +//    scan.setFilter(new RowFilter(CompareFilter.CompareOp.EQUAL,
          +//      new BinaryComparator(Bytes.toBytes("row0"))));
          

          Otherwise, patch looks great.

          This patch needs a release note describing how it changes how Get works.

          stack added a comment -

          Testing, I'm hanging when there is lots of concurrency at HLog.append, at the synchronization on the updateLock inside the append method. It's strange. A bunch of threads are BLOCKED at this explicit line. All but one say "waiting to lock". A single thread is BLOCKED but it has 'locked' successfully, as though it should have moved on, yet it shows the same location in the stack trace (same line number) and there doesn't seem to be anything in the block that threads could contend over.

          Trying w/ different JVMs to see if I can get more info.

          stack added a comment -

          So, it's not a lockup; rather, stuff is working but really, really slowly. It seems to be this patch, because going back to a clean hadoop 0.20.2 and the current state of the pre_durability branch, all runs fine again (until we hit an actual deadlock, i.e. the known deadlock issue). I'll spend more time trying to figure it out, but here is how it looks when you thread dump:

          Most threads are 'WAITING', etc. and then a good few are like the below BLOCKED:

          "IPC Server handler 36 on 60020" daemon prio=10 tid=0x273e4400 nid=0x2888 waiting for monitor entry [0x257ad000]
             java.lang.Thread.State: BLOCKED (on object monitor)
                  at org.apache.hadoop.hbase.regionserver.HLog.append(HLog.java:646)
                  - waiting to lock <0x3d5ec6a0> (a java.lang.Object)
          

          Invariably, there is one 'abnormal' BLOCKED thread that is the same as above – excepting thread names etc. – in all respects except that 'waiting to lock' is instead 'locked' – same line number and everything.

          I'll keep digging.

          Andrew Purtell added a comment -

          I see the same results as Stack testing up on EC2, with 1.6.0_14 (64 bit). First thing I do is warm the cluster with PE --nomapred randomWrite N (usually N=10 or 15). Not a deadlock, but writes get really slow fast. Without the high write concurrency (N=1 or 3) it's better.

          Andrew Purtell added a comment -

          We spend 76% of CPU time in ReadWriteConsistencyControl.completeMemstoreInsert. See attached 'profile.png'. Draining the write queue (for 42,175 puts?) explodes 42,166 calls to ReadWriteConsistencyControl.completeMemstoreInsert into 121,118,233 calls to AtomicLong.get and 121,143,574 calls to $WriteEntry.getWriteNumber, each arc represents 50% of the cumulative time there. See attached 'put_call_graph.png'.

          Andrew Purtell added a comment -

          From Ryan via email:

          There is a busy wait loop which attempts to ensure a write completes only when it is visible to others. With the log append as part of the "transaction" this is breaking down. The solution is to either forgo the busy wait loop (probably not a great idea) or restructure the code to do hlog appends first then memstore updates.

          I'll talk to stack tomorrow and we can figure which route is better... Although I'd guess option #2
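
          A simplified outline of the two orderings being weighed (illustrative interfaces only, not the real HLog/MemStore/RWCC signatures): with the WAL sync inside the read-point window, every writer spins behind earlier in-flight syncs, whereas appending the log first keeps the spin to the cheap memstore insert.

          class PutOrderingSketch {
            interface Wal { void appendAndSync(byte[] edit) throws Exception; }
            interface Memstore { void insert(byte[] edit); }
            interface Rwcc {
              Object begin();            // assign a write number, enter the queue
              void complete(Object w);   // advance the read point, then spin until visible
            }

            // Current shape: the WAL sync sits inside the visibility window, so every
            // writer behind us must spin through our disk sync before its edit is visible.
            void putWalInsideWindow(Wal wal, Memstore mem, Rwcc rwcc, byte[] edit) throws Exception {
              Object w = rwcc.begin();
              wal.appendAndSync(edit);   // milliseconds
              mem.insert(edit);          // microseconds
              rwcc.complete(w);          // spins behind every earlier in-flight sync
            }

            // Restructured shape: append (and sync) the WAL first, then run the short
            // begin-insert-complete sequence, so the spin only covers memstore work.
            void putWalFirst(Wal wal, Memstore mem, Rwcc rwcc, byte[] edit) throws Exception {
              wal.appendAndSync(edit);
              Object w = rwcc.begin();
              mem.insert(edit);
              rwcc.complete(w);
            }
          }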

          stack added a comment -

          The release note for this issue needs to include a note on how versioning and deletes change (I was going to say we should add a new issue to add more extensive unit testing of our claim that versions will come out in the right order now regardless of how they are put in, but we already have an issue for that – the ACID spec tests issue).

          ryan rawson added a comment -

          ok here is a patch that addresses all the above issues:

          • spin fixed by restructuring hlog append
          • indexed contrib test failure fixed
          • test failures due to compaction fixed
          • all comments addressed

          To accomplish the indexed hbase fix, I had to introduce a new notion of optional scanner-creation atomicity along with pre-flush-commit work: a sub-class can create an atomic section whereby some work is done (e.g. switching out an index) together with the flush commit (where the snapshot is removed and the hfile is introduced to open scanners), and this section is atomic relative to new scanner creation. This was required to fix race conditions in indexed hbase, which also means that indexed hbase is not as fast as it can be, since it cannot create new scanners during this one critical phase of flush (which includes re-reading scanner blocks, btw).
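
          One way to picture the "atomic relative to new scanner creation" section, as a sketch under assumptions rather than the patch's actual API: a read-write lock where flush commit plus the subclass's pre-commit work hold the write side and new scanner creation holds the read side.

          import java.util.concurrent.locks.ReentrantReadWriteLock;

          class FlushScannerAtomicitySketch {
            private final ReentrantReadWriteLock scannerCreationLock = new ReentrantReadWriteLock();

            interface Scanner {}

            /** New scanners take the read side: many can be created concurrently. */
            Scanner newScanner() {
              scannerCreationLock.readLock().lock();
              try {
                return buildScanner();
              } finally {
                scannerCreationLock.readLock().unlock();
              }
            }

            /** Flush commit takes the write side, so pre-commit work plus the commit are
             *  atomic with respect to scanner creation (not with respect to running scanners). */
            void commitFlush(Runnable subclassPreCommitWork, Runnable swapSnapshotForHFile) {
              scannerCreationLock.writeLock().lock();
              try {
                subclassPreCommitWork.run();   // e.g. an indexed region switches out its index
                swapSnapshotForHFile.run();    // memstore snapshot dropped, new hfile made visible
              } finally {
                scannerCreationLock.writeLock().unlock();
              }
            }

            private Scanner buildScanner() { return new Scanner() {}; }
          }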

          stack added a comment -

          This last patch is looking good. It looks like the spin lock slowing writes is gone, and most of the tests are passing. Will report more in the morning. Will let tests run overnight.

          stack added a comment -

          All tests but this in 'indexed' pass:

          [junit] Test org.apache.hadoop.hbase.regionserver.TestIdxRegionIndexManager FAILED

          My cluster test failed, but for reasons not attributable to this patch. I did not deadlock.

          stack added a comment -

          I'm going to commit this. All tests pass if I remove 'indexed'. Patch looks good. I have cluster issues but unrelated to this patch. Logs for regionservers look good.

          stack added a comment -

          Thanks all who contributed to this issue: Todd, Dan, Yoram and in particular Ryan.

          ryan rawson added a comment -

          Here is the updated version, with fixes taken from the work on 0.20_pre_durability but on plain old 0.20.

          ryan rawson added a comment -

          This removes row locks, which are no longer necessary to ensure atomic reads.

          Jonathan Gray added a comment -

          ryan, can you explain more about removal of row locks? seems like your patch just touches the simple get case that takes a row lock. Are client-exposed row locks completely gone now?

          stack added a comment -

          @Jon No, just the row lock around the Get. The client-side row lock is still in place.

          I just tested it up on cluster and it seems to run fine. Going to commit.

          ryan rawson added a comment -

          [[ Old comment, sent by email on Wed, 14 Apr 2010 14:43:49 -0700 ]]

          The previous idx test break was a simple heap check break, so easy to
          fix in the eventual destination. The core of the indexed stuff seemed
          to work.

          Andrew Purtell added a comment -

          [[ Old comment, sent by email on Wed, 14 Apr 2010 04:52:37 +0000 ]]

          Thanks Ryan! Testing now.

          stack added a comment -

          Hey Ryan, commit this to branch and TRUNK. Testing over in branch-of-branch says it's good.


            People

            • Assignee:
              ryan rawson
              Reporter:
              Dave Latham
            • Votes:
              0
              Watchers:
              8
