We ran some experiments loading lineitem at scale factor 15k. The 10-node cluster (1 master, 9 TS) is equipped with Intel P3700 SSDs, one per TS, dedicated to the WALs. The table is hash-partitioned and configured with 10 tablets per TS.
Our findings:
- Reading the bloom filters puts a lot of contention on the block cache. This isn't new (see KUDU-613), but it now shows up on the write path because the SSDs are just really fast.
- Kudu performs best when data is inserted in order, but with hash partitioning we end up with multiple clients writing simultaneously in different key ranges within each tablet. This becomes a worst-case scenario: we have to compact (optimize) the row sets over and over again to put them in order. Even if we delayed this until the end of the bulk load, we'd still take a hit because we have to check more and more bloom filters to determine whether a row already exists.
- In the case of an initial bulk load, we know we're not trying to overwrite or update rows, so all of those checks are unnecessary.
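To make the second finding concrete, here is a toy model (hypothetical, not Kudu code; the ranges and counts are made up) of why interleaved writers inflate the per-insert duplicate-key cost: a key must be probed against the bloom filter of every row set whose key range covers it, so fully overlapping row sets mean one probe per row set, while an ordered load keeps it to one probe total.

```python
# Toy model (not Kudu code): count bloom-filter probes needed for a
# duplicate-key check as flushed row sets accumulate during a load.

def bloom_probes(rowsets, key):
    """A key only needs probing against row sets whose [min, max]
    key range covers it."""
    return sum(1 for lo, hi in rowsets if lo <= key <= hi)

# Interleaved writers: every flush spans nearly the whole key space,
# so all 50 row sets must be probed for each new row.
interleaved = [(0, 1000)] * 50
# Ordered load: each flush covers a disjoint slice of the key space.
ordered = [(i * 20, i * 20 + 19) for i in range(50)]

print(bloom_probes(interleaved, 500))  # 50 probes
print(bloom_probes(ordered, 500))      # 1 probe
```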
Some ideas for improvement:
- Obviously, we need a better block cache.
- When flushing, we could detect those disjoint sets of rows and make sure they map to row sets that don't cover the gaps. For example, if the MRS contains a,b,c,x,y,z, then flushing would give us two row sets, e.g. a,b,c and x,y,z, instead of one. The danger here is generating too many row sets.
- When reading, make the row set interval tree smart enough to not send readers into the row set gaps. With the same example, say we're looking for "m": normally we'd see a row set spanning a-z and would have to check its bloom filter, but if we could detect that it's actually a-c then x-z, we'd save a check.
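The last two ideas can be sketched together (a hypothetical illustration, not Kudu code; the gap threshold and integer keys are assumptions, with a=1, c=3, m=13, x=24, z=26 standing in for the letters above): split the sorted MRS at large key gaps on flush, then let reads rule out keys that fall into a gap without touching any bloom filter.

```python
# Sketch (hypothetical): split a sorted MRS at large key gaps on flush,
# then skip bloom checks entirely for keys that land in a gap.

def split_at_gaps(sorted_keys, gap):
    """Return (min, max) ranges, starting a new row set whenever two
    consecutive keys are more than `gap` apart."""
    runs, start, prev = [], sorted_keys[0], sorted_keys[0]
    for k in sorted_keys[1:]:
        if k - prev > gap:
            runs.append((start, prev))
            start = k
        prev = k
    runs.append((start, prev))
    return runs

def needs_bloom_check(ranges, key):
    """Only probe a bloom filter if the key lands inside some sub-range;
    a key in a gap is provably absent."""
    return any(lo <= key <= hi for lo, hi in ranges)

mrs = [1, 2, 3, 24, 25, 26]           # "a,b,c,x,y,z" as integers
ranges = split_at_gaps(mrs, gap=5)
print(ranges)                          # [(1, 3), (24, 26)]
print(needs_bloom_check(ranges, 13))   # False: "m" falls in the gap
print(needs_bloom_check(ranges, 2))    # True
```

The trade-off from the flushing bullet shows up here too: a smaller `gap` threshold produces more, tighter row sets (fewer wasted bloom checks) at the cost of more row sets to manage.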