[CASSANDRA-5677] Performance improvements of RangeTombstones/IntervalTree - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
Fix Version/s: 1.2.7, 2.0 beta 2
Component/s: None
Labels:
None

Description

Using massively range tombstones leads to bad response time (ie 100-500 ranges tombstones per row).

After investigation, it seems that the culprit is how the DeletionInfo are merged. Each time a RangeTombstone is added into the DeletionInfo, the whole IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 100 instances of IntervalTree are created, the first one having one interval, the second one 2 intervals,... the 100th one : 100 intervals...)

It seems that once the IntervalTree is built, it is not possible to add a new Interval. Idea is to change the implementation of the IntervalTree by another one which support "insert interval".

Attached is a proposed patch which :

renames the IntervalTree implementation to IntervalTreeCentered (the renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
adds a new implementation IntervalTreeAvl (which is described here : http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : http://en.wikipedia.org/wiki/AVL_tree )
adds a new interface IIntervalTree to abstract the implementation
adds a new configuration option (interval_tree_provider) which allows to choose between the two implementations (defaults to previous IntervalTreeCentered)
updates IntervalTreeTest unit tests to test both implementations
creates a mini benchmark between the two implementations (tree creation, point lookup, interval lookup)
creates a mini benchmark between the two implementations when merging DeletionInfo (which shows a big performance improvement when using 500 tombstones for a row)

This patch applies for 1.2 branch...

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

5677-new-IntervalTree-implementation.patch
20/Jun/13 16:07
87 kB
Fabien Rousseau
5677-1.2.txt
09/Jul/13 16:37
73 kB
Sylvain Lebresne
5677-1.2.overlappingfix.txt
12/Jul/13 09:45
2 kB
Fabien Rousseau

Issue Links

is duplicated by

CASSANDRA-5748 When flushing, nodes spent almost 100% in AbstractCompositeType.compare

Resolved

CASSANDRA-5736 CQL3PagingRecordReader can OOM and kill nodes

Resolved

Activity

People

Assignee:: Sylvain Lebresne

Reporter:: Fabien Rousseau

Authors:: Sylvain Lebresne

Reviewers:: Fabien Rousseau

Votes:: 2 Vote for this issue

Watchers:: 10 Start watching this issue

Dates

Created:: 20/Jun/13 16:05

Updated:: 16/Apr/19 09:32

Resolved:: 12/Jul/13 14:42