Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-12731

Remove IndexInfo cache from FileIndexInfoRetriever.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 3.10
    • Component/s: Local Write-Read Paths
    • Labels:
      None

      Description

      Hi guys.
      In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path.

      After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%).

      // LargePartitionsTest.test_13_4G with cache by array
      INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms
      INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms
      
      // LargePartitionsTest.test_13_4G without cache
      INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms
      INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms
      

      Code is here (based on trunk)

      Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original)

      Heap memory usage during running LargePartitionsTest (except for 8G test) without cache

      Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. (Unless I did some mistake or overlook about this test)

      LargePartitionsTest.test_12_2G SELECTs 1 (ms) SELECTs 2 (ms) Scan (ms)
      Original (array) 62736 48562 41540
      ConcurrentHashMap 47597 30854 18271
      ConcurrentHashMap 2nd trial 44036 26895 17443
      LinkedHashCache (capacity=16, limit=10, fifo) 1st 42668 32165 17323
      LinkedHashCache (capacity=16, limit=10, fifo) 2nd 48863 28066 18053
      LinkedHashCache (capacity=16, limit=16, fifo) 46979 29810 18620
      LinkedHashCache (capacity=16, limit=10, lru) 46456 29749 20311
      No Cache 47579 32480 18337
      No Cache 2nd trial 46534 27670 18700

      Code that I used for this comparison is here. LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap.
      Scan is a execution time to iterate through the large partition.

      So, in this issue, I'd like to propose to remove IndexInfo cache from FileIndexInfoRetriever to improve the performance on large partitions.

      1. screenshot-2.png
        142 kB
        Yasuharu Goto
      2. screenshot-1.png
        111 kB
        Yasuharu Goto

        Activity

        Hide
        snazy Robert Stupp added a comment -

        Thanks for the patch!

        Committed as 033f2565ef25bc764141dc0fc64088fab6de89c9 to cassandra-3.X

        Show
        snazy Robert Stupp added a comment - Thanks for the patch! Committed as 033f2565ef25bc764141dc0fc64088fab6de89c9 to cassandra-3.X
        Hide
        Yasuharu Yasuharu Goto added a comment -

        Thank you very much for your review and cleaning up, Robert Stupp !

        Show
        Yasuharu Yasuharu Goto added a comment - Thank you very much for your review and cleaning up, Robert Stupp !
        Hide
        snazy Robert Stupp added a comment - - edited

        Nice catch, Yasuharu Goto !
        I did some comparison myself and get similar improvements using the patch you provided so I've setup a branch for CI. The branch also includes a temporary fix for the issue you've also reported as CASSANDRA-12717.

        cassandra-3.X branch testall dtest
        Show
        snazy Robert Stupp added a comment - - edited Nice catch, Yasuharu Goto ! I did some comparison myself and get similar improvements using the patch you provided so I've setup a branch for CI. The branch also includes a temporary fix for the issue you've also reported as CASSANDRA-12717 . cassandra-3.X branch testall dtest
        Show
        Yasuharu Yasuharu Goto added a comment - Patch is here. https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce

          People

          • Assignee:
            Yasuharu Yasuharu Goto
            Reporter:
            Yasuharu Yasuharu Goto
            Reviewer:
            Robert Stupp
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development