CASSANDRA-5677

Performance improvements of RangeTombstones/IntervalTree


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 1.2.7, 2.0 beta 2
    • None
    • None

    Description

      Heavy use of range tombstones (i.e. 100-500 range tombstones per row) leads to poor response times.

      After investigation, the culprit appears to be how DeletionInfo objects are merged. Each time a RangeTombstone is added to a DeletionInfo, the whole IntervalTree is rebuilt. Thus, with 100 tombstones in one row, 100 instances of IntervalTree are created: the first with one interval, the second with 2 intervals, ... the 100th with 100 intervals.
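
      To put a number on the rebuild pattern described above, here is a minimal sketch that only counts how many intervals are handed to a tree build when the tree is rebuilt after every added tombstone. The class and method names are hypothetical and nothing here is taken from Cassandra or the attached patch. The count is a lower bound on the real cost, since each build also has to organise its intervals.

```java
// Hypothetical illustration; not code from Cassandra or the attached patch.
final class RebuildCost {
    /** Intervals processed when the tree is rebuilt from scratch after every added tombstone. */
    static long rebuildEveryTime(int tombstones) {
        long processed = 0;
        for (int size = 1; size <= tombstones; size++) {
            processed += size; // the k-th rebuild touches all k intervals
        }
        return processed;
    }

    /** Closed form of the same sum: n(n+1)/2. */
    static long closedForm(int tombstones) {
        return (long) tombstones * (tombstones + 1) / 2;
    }
}
```

      For 100 tombstones this gives 5050 intervals processed, and for 500 tombstones 125250, so the work grows quadratically with the number of tombstones in the row.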

      Once an IntervalTree is built, it does not seem possible to add a new interval to it. The idea is to replace the IntervalTree implementation with one that supports inserting intervals.
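
      A tree that supports insertion could look like the following minimal sketch: an AVL-balanced tree keyed on interval start, where each node also stores the largest end point found in its subtree. It assumes integer end points and closed intervals, and all names are hypothetical. It is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class and method names are illustrative, not taken from the patch.
final class InsertableIntervalTree {
    private static final class Node {
        final int lo, hi;   // closed interval [lo, hi]
        int max;            // largest 'hi' in this subtree (the augmentation)
        int height = 1;
        Node left, right;
        Node(int lo, int hi) { this.lo = lo; this.hi = hi; this.max = hi; }
    }

    private Node root;
    private int size;

    /** Adds one interval in O(log n) without rebuilding the tree. */
    void insert(int lo, int hi) { root = insert(root, lo, hi); size++; }

    int size() { return size; }

    int height() { return height(root); }

    /** Returns every stored interval containing the point, as {lo, hi} pairs ordered by lo. */
    List<int[]> search(int point) {
        List<int[]> out = new ArrayList<>();
        search(root, point, out);
        return out;
    }

    private static Node insert(Node n, int lo, int hi) {
        if (n == null) return new Node(lo, hi);
        if (lo < n.lo) n.left = insert(n.left, lo, hi);
        else n.right = insert(n.right, lo, hi);
        return rebalance(n);
    }

    private static void search(Node n, int point, List<int[]> out) {
        if (n == null || n.max < point) return;         // no interval below reaches the point
        search(n.left, point, out);
        if (n.lo <= point && point <= n.hi) out.add(new int[] {n.lo, n.hi});
        if (point >= n.lo) search(n.right, point, out); // right subtree starts at or after n.lo
    }

    private static int height(Node n) { return n == null ? 0 : n.height; }

    /** Recomputes height and max end point from the children. */
    private static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
        n.max = n.hi;
        if (n.left != null) n.max = Math.max(n.max, n.left.max);
        if (n.right != null) n.max = Math.max(n.max, n.right.max);
    }

    private static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        update(y);
        update(x);
        return x;
    }

    private static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }

    /** Standard AVL rebalancing; the rotations keep the max augmentation up to date. */
    private static Node rebalance(Node n) {
        update(n);
        int balance = height(n.left) - height(n.right);
        if (balance > 1) {
            if (height(n.left.left) < height(n.left.right)) n.left = rotateLeft(n.left);
            return rotateRight(n);
        }
        if (balance < -1) {
            if (height(n.right.right) < height(n.right.left)) n.right = rotateRight(n.right);
            return rotateLeft(n);
        }
        return n;
    }
}
```

      Inserting 500 intervals one at a time keeps the tree height logarithmic, so each added tombstone costs O(log n) instead of a full rebuild.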

      Attached is a proposed patch which:

      • renames the IntervalTree implementation to IntervalTreeCentered (the name is taken from: http://en.wikipedia.org/wiki/Interval_tree)
      • adds a new implementation, IntervalTreeAvl (described here: http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here: http://en.wikipedia.org/wiki/AVL_tree)
      • adds a new interface, IIntervalTree, to abstract the implementation
      • adds a new configuration option (interval_tree_provider) to choose between the two implementations (defaults to the previous one, IntervalTreeCentered)
      • updates the IntervalTreeTest unit tests to test both implementations
      • adds a mini benchmark comparing the two implementations (tree creation, point lookup, interval lookup)
      • adds a mini benchmark comparing the two implementations when merging DeletionInfo (which shows a big performance improvement when using 500 tombstones for a row)
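
      As a rough illustration of how an interface plus a provider option can select between the two implementations, here is a minimal sketch. Only the names IIntervalTree, IntervalTreeCentered, IntervalTreeAvl and interval_tree_provider come from this issue. The accepted option values, the IntervalTreeProvider class, the name() method and the empty stub classes are all assumptions, not the patch's actual code.

```java
import java.util.function.Supplier;

// Hypothetical sketch; the interface body and the stubs are placeholders.
interface IIntervalTree {
    String name(); // assumed method, only here so the stubs can be told apart
}

final class IntervalTreeCentered implements IIntervalTree {
    public String name() { return "IntervalTreeCentered"; }
}

final class IntervalTreeAvl implements IIntervalTree {
    public String name() { return "IntervalTreeAvl"; }
}

final class IntervalTreeProvider {
    // The issue states that the option defaults to the previous implementation.
    static final String DEFAULT = "IntervalTreeCentered";

    /** Maps the configured interval_tree_provider value (assumed to be a class name) to a factory. */
    static Supplier<IIntervalTree> forName(String configured) {
        String name = (configured == null || configured.isEmpty()) ? DEFAULT : configured;
        switch (name) {
            case "IntervalTreeCentered":
                return IntervalTreeCentered::new;
            case "IntervalTreeAvl":
                return IntervalTreeAvl::new;
            default:
                throw new IllegalArgumentException("Unknown interval_tree_provider: " + name);
        }
    }
}
```

      Callers would then depend only on IIntervalTree, which is also what lets the same unit tests and benchmarks run against both implementations.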

      This patch applies to the 1.2 branch...

      Attachments

        1. 5677-1.2.overlappingfix.txt (2 kB, Fabien Rousseau)
        2. 5677-1.2.txt (73 kB, Sylvain Lebresne)
        3. 5677-new-IntervalTree-implementation.patch (87 kB, Fabien Rousseau)


          People

            Assignee: Sylvain Lebresne (slebresne)
            Reporter: Fabien Rousseau (frousseau)
            Votes: 2
            Watchers: 10
