Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-15397

IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

    XMLWordPrintableJSON

    Details

    • Change Category:
      Performance
    • Complexity:
      Normal
    • Platform:
      All
    • Impacts:
      None

      Description

      Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair.

      Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with:

      • Linear Walk.
      • Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval).

      Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree or out performs IntervalTree based search. The cost of IntervalTree construction is also substantial and produces lot of garbage during repairs.

      I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. The x-axis in the graphs is the search interval coverage. 10p means the search interval covered 10% of the intervals. The y-axis is the time the search took in nanos.

      PS:

      1. For the purpose of test, I simplified the IntervalTree by removing the data portion of the interval. Modified the template version (Java generics) to a specialized version.
      2. I used the code from Cassandra version 3.11.
      3. Time in the graph is in nanos.

        Attachments

        1. 90p_100k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        2. 90p_1million_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        3. 90p_250k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        4. 90p_500k_sstables_with_1000_searches.png
          14 kB
          Chandrasekhar Thumuluru
        5. 90p_750k_sstables_with_1000_searches.png
          14 kB
          Chandrasekhar Thumuluru
        6. 95p_10000_SSTable_with_5000_Searches.png
          13 kB
          Chandrasekhar Thumuluru
        7. 95p_100k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        8. 95p_15000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        9. 95p_1million_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        10. 95p_20000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        11. 95p_25000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        12. 95p_250k_sstables_with_1000_searches.png
          14 kB
          Chandrasekhar Thumuluru
        13. 95p_30000_SSTable_with_5000_Searches.png
          14 kB
          Chandrasekhar Thumuluru
        14. 95p_5000_SSTable_with_5000_Searches.png
          14 kB
          Chandrasekhar Thumuluru
        15. 95p_500k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        16. 95p_750k_sstables_with_1000_searches.png
          14 kB
          Chandrasekhar Thumuluru
        17. 99p_10000_SSTable_with_5000_Searches.png
          18 kB
          Chandrasekhar Thumuluru
        18. 99p_100k_sstables_with_1000_searches.png
          16 kB
          Chandrasekhar Thumuluru
        19. 99p_15000_SSTable_with_5000_Searches.png
          17 kB
          Chandrasekhar Thumuluru
        20. 99p_1million_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        21. 99p_20000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        22. 99p_25000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        23. 99p_250k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        24. 99p_30000_SSTable_with_5000_Searches.png
          16 kB
          Chandrasekhar Thumuluru
        25. 99p_5000_SSTable_with_5000_Searches.png
          16 kB
          Chandrasekhar Thumuluru
        26. 99p_500k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        27. 99p_750k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        28. IntervalList.java
          3 kB
          Chandrasekhar Thumuluru
        29. IntervalListWithElimination.java
          5 kB
          Chandrasekhar Thumuluru
        30. IntervalTreeSimplified.java
          6 kB
          Chandrasekhar Thumuluru
        31. Mean_10000_SSTable_with_5000_Searches.png
          14 kB
          Chandrasekhar Thumuluru
        32. Mean_100k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        33. Mean_15000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        34. Mean_1million_sstables_with_1000_searches.png
          14 kB
          Chandrasekhar Thumuluru
        35. Mean_20000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        36. Mean_25000_SSTable_with_5000_Searches.png
          15 kB
          Chandrasekhar Thumuluru
        37. Mean_250k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        38. Mean_30000_SSTable_with_5000_Searches.png
          14 kB
          Chandrasekhar Thumuluru
        39. Mean_5000_SSTable_with_5000_Searches.png
          14 kB
          Chandrasekhar Thumuluru
        40. Mean_500k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        41. Mean_750k_sstables_with_1000_searches.png
          15 kB
          Chandrasekhar Thumuluru
        42. replace_intervaltree_with_intervallist.patch
          23 kB
          Chandrasekhar Thumuluru
        43. TESTS-TestSuites.xml.lz4
          25.48 MB
          Chandrasekhar Thumuluru

          Activity

            People

            • Assignee:
              cthumuluru Chandrasekhar Thumuluru
              Reporter:
              cthumuluru Chandrasekhar Thumuluru
              Authors:
              Chandrasekhar Thumuluru
              Reviewers:
              Benedict Elliott Smith
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 40m
                40m