Cassandra
  CASSANDRA-3762

AutoSaving KeyCache and System load time improvements.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.2.0 beta 1
    • Component/s: Core
    • Labels:
      None

      Description

      CASSANDRA-2392 saves the index summary to disk... but when we have a saved cache we still scan through the whole index to get the data out.

      We might be able to separate this from SSTR.load and let it load just the index summary; once all the SSTables are loaded we could check the bloom filter and do random I/O on fewer indexes to populate the KeyCache.

      1. 0001-SavedKeyCache-load-time-improvements.patch
        5 kB
        Vijay
      2. 0001-CASSANDRA-3762-v6.patch
        40 kB
        Vijay
      3. 0001-CASSANDRA-3762-v5.patch
        31 kB
        Vijay
      4. 0001-CASSANDRA-3762-v4.patch
        31 kB
        Vijay
      5. 0001-CASSANDRA-3762-v3.patch
        31 kB
        Vijay
      6. 0001-CASSANDRA-3762-v2.patch
        25 kB
        Vijay

          Activity

          Vijay added a comment -

           This patch reads the keys from the saved cache and sorts them, so the index scan has less impact on the mmapped index.

          Test (Not an extensive test but from my laptop):
          100 Keys to be loaded into cache.

          7G data file and 15M index (long type keys)
          before the patch:
          /var/log/cassandra/system.log:DEBUG [SSTableBatchOpen:4] 2012-01-23 15:10:40,825 SSTableReader.java (line 196) INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace2/Standard3/Keyspace2-Standard3-hc-2777: 850 ms.
          after this patch:
          /var/log/cassandra/system.log:DEBUG [SSTableBatchOpen:4] 2012-01-23 15:10:59,128 SSTableReader.java (line 196) INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace2/Standard3/Keyspace2-Standard3-hc-2777: 177 ms.
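
           For illustration, a minimal sketch of the approach described in this comment: read the saved keys, sort them, and probe the index in order so successive reads land on nearby blocks. The Index and KeyCache interfaces are hypothetical stand-ins, not the actual SSTableReader API.

           import java.nio.ByteBuffer;
           import java.util.ArrayList;
           import java.util.Collections;
           import java.util.List;

           final class SortedKeyCacheLoadSketch
           {
               // Hypothetical stand-ins for SSTableReader internals.
               interface Index { Long getPosition(ByteBuffer key); }          // random read into the -Index.db file
               interface KeyCache { void put(ByteBuffer key, long position); }

               static void preload(List<ByteBuffer> savedKeys, Index index, KeyCache cache)
               {
                   // Sorting the saved keys makes successive lookups touch adjacent index
                   // blocks, so an mmapped index is read mostly sequentially.
                   List<ByteBuffer> keys = new ArrayList<ByteBuffer>(savedKeys);
                   Collections.sort(keys); // assumption: the real patch sorts in the same order as the index

                   for (ByteBuffer key : keys)
                   {
                       Long position = index.getPosition(key);
                       if (position != null)
                           cache.put(key, position);
                   }
               }
           }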

          Pavel Yaskevich added a comment -

           With this patch we trade a whole sequential primary index read for random I/O via SSTableReader.getPosition(), one call per saved key. Can you extend the key cache, say to 75% of the keys, and run your test again? I think the closer the key cache size gets to the actual number of keys, the worse performance will get...

          Vijay added a comment -

           I don't think it is as bad as it looks... We aren't doing a lot of random I/O, because with this patch the keys are sorted, so we read the same blocks often, and if the index is mmapped it benefits the most. Also, for most workloads the keys will not all come from the same SSTables, and 75% of the keys falling into one SSTable is not that common IMO (if they do, they have a bigger problem, because all their reads are going to get longer and longer); the load time increases when we have a lot of data on disk.
           I got around 180 ms for 3K keys, and that is as far as the memory in my laptop goes.

           The other option is to redesign the key cache to save the index location when we store the keys, then look it up at load time and fault-fill whatever is not in the cache via getPosition()... Makes sense?

          Pavel Yaskevich added a comment -

           I mention this because the problem in the original ticket was rolling restarts taking too much time on index summary computation (reading through the whole primary index for every SSTable out there). So imagine a situation where you have a few hundred SSTables, each with key cache entries in different parts of the primary index: if you go with getPosition() calls you will have a lot of random I/O on each of them (meaning you will have to seek deeper and deeper into the primary index file, which means slower data access even in mmap mode), and I'm not sure that is really better than reading the primary index sequentially, especially knowing that you have already read all of the index/data positions from the Summary component. I propose you do the test with many SSTables and compare system load times (don't forget to drop the page cache between tests with `sync; echo 3 > /proc/sys/vm/drop_caches`).

           By the way, I forgot to ask whether you dropped the page cache before running the second test? If you didn't, that would pretty much explain such a dramatic improvement in load time...

          Vijay added a comment -

           >>> By the way, I forgot to ask whether you dropped the page cache before running the second test?
           No, it is basically an average of multiple runs... without any additional writes... start, immediately stop, and compare the logs... both tests are the same. Again, it is on my laptop.

          Pavel Yaskevich added a comment -

           No, it is basically an average of multiple runs... without any additional writes... start, immediately stop, and compare the logs... both tests are the same. Again, it is on my laptop.

           Which means the more runs you do, the more data gets cached, which affects the results. I would suggest dropping the page cache before each test run to get cleaner load-time values whenever I/O is involved.

          Vijay added a comment -

           That applies to the sequential index scans too (the variables are the same; both are in memory, and it is just a 15 MB index)... Looks like I have to do a more extensive test than this one, but I am not sure what an optimal number is; I will try loading the key cache overnight with the stress tool.

          Vijay added a comment -

           A bigger test with much more spread-out data showed some mixed results... Tested on M24XL nodes in AWS, 8M keys inserted via the stress tool 3 times to get a distribution of the keys across the SSTables.

          Trunk shows:
          DEBUG [SSTableBatchOpen:1] 2012-01-25 09:48:28,098 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/system/LocationInfo/system-LocationInfo-hc-17: 16 ms.
          DEBUG [SSTableBatchOpen:6] 2012-01-25 09:48:47,699 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-637: 4289 ms.
          DEBUG [SSTableBatchOpen:3] 2012-01-25 09:48:51,541 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-607: 8136 ms.
          DEBUG [SSTableBatchOpen:5] 2012-01-25 09:48:52,910 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-429: 9504 ms.
          DEBUG [SSTableBatchOpen:1] 2012-01-25 09:49:07,725 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-611: 24321 ms.
          DEBUG [SSTableBatchOpen:7] 2012-01-25 09:49:29,760 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-534: 46348 ms.
          DEBUG [SSTableBatchOpen:8] 2012-01-25 09:49:34,972 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-462: 51560 ms.
          DEBUG [SSTableBatchOpen:2] 2012-01-25 09:49:35,893 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-636: 52488 ms.
          DEBUG [SSTableBatchOpen:4] 2012-01-25 09:49:45,671 SSTableReader.java (line 193) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-635: 62258 ms

          JMX key cache size:
          01/25/2012 09:50:18 +0000 org.archive.jmx.Client KeyCacheSize: 104857584

          After this patch:
          DEBUG [SSTableBatchOpen:6] 2012-01-25 18:22:33,343 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-637: 5731 ms.
          DEBUG [SSTableBatchOpen:6] 2012-01-25 18:22:33,344 SSTableReader.java (line 198) key cache contains 2/2184533 keys
          DEBUG [SSTableBatchOpen:3] 2012-01-25 18:22:34,291 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-607: 6685 ms.
          DEBUG [SSTableBatchOpen:3] 2012-01-25 18:22:34,291 SSTableReader.java (line 198) key cache contains 7619/2184533 keys
          DEBUG [SSTableBatchOpen:1] 2012-01-25 18:22:37,144 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-611: 9541 ms.
          DEBUG [SSTableBatchOpen:1] 2012-01-25 18:22:37,145 SSTableReader.java (line 198) key cache contains 45172/2184533 keys
          DEBUG [SSTableBatchOpen:5] 2012-01-25 18:22:37,275 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-429: 9664 ms.
          DEBUG [SSTableBatchOpen:5] 2012-01-25 18:22:37,276 SSTableReader.java (line 198) key cache contains 48914/2184533 keys
          DEBUG [SSTableBatchOpen:2] 2012-01-25 18:22:42,224 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-636: 14620 ms.
          DEBUG [SSTableBatchOpen:2] 2012-01-25 18:22:42,224 SSTableReader.java (line 198) key cache contains 170841/2184533 keys
          DEBUG [SSTableBatchOpen:4] 2012-01-25 18:22:45,053 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-635: 17442 ms.
          DEBUG [SSTableBatchOpen:4] 2012-01-25 18:22:45,053 SSTableReader.java (line 198) key cache contains 241841/2184533 keys
          DEBUG [SSTableBatchOpen:8] 2012-01-25 18:23:01,720 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-462: 34107 ms.
          DEBUG [SSTableBatchOpen:8] 2012-01-25 18:23:01,720 SSTableReader.java (line 198) key cache contains 689699/2184533 keys
          DEBUG [SSTableBatchOpen:7] 2012-01-25 18:24:19,975 SSTableReader.java (line 195) INDEX LOAD TIME for /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-534: 112363 ms.
          DEBUG [SSTableBatchOpen:7] 2012-01-25 18:24:19,975 SSTableReader.java (line 198) key cache contains 2184533/2184533 keys

          Sizes of the index:
          1068 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-637-Index.db
          7852 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-429-Index.db
          20044 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-607-Index.db
          84600 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-611-Index.db
          188560 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-534-Index.db
          212204 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-636-Index.db
          217292 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-462-Index.db
          315988 /mnt/data/cassandra070/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-635-Index.db

           If we want the optimal solution for all the use cases, I think we have to go for the alternative where we save the key cache positions to disk and read them back, and let whatever is missing fault-fill. Agree?

          Pavel Yaskevich added a comment -

           Can you please update your comment to include the "key cache contains x/y keys" numbers for the results on the current trunk, like you did for the "After this patch" results? Btw, did you drop the page cache using `sync; echo 3 > /proc/sys/vm/drop_caches` before running the "After this patch" tests?

          Vijay added a comment -

           >>> did you drop the page cache
           Yes (I always do that, except when I am testing on my laptop).

           >>> Can you please update your comment with "key cache contains x/y keys"
           Why would that matter? Trunk reads the index even if there are 0/0 keys to be loaded... but for this test the counts for both loads are the same... (because all I did was keep restarting, dropping the cache in between). The above results are a sample of multiple restarts... They are relative to my environment.

          Pavel Yaskevich added a comment -

           Why would that matter? Trunk reads the index even if there are 0/0 keys to be loaded... but for this test the counts for both loads are the same... (because all I did was keep restarting, dropping the cache in between). The above results are a sample of multiple restarts... They are relative to my environment.

           Because even if it reads the whole index, we don't know how the key cache pre-load affects performance. And with your results we can clearly see the thing I was talking about: after this patch, key cache pre-load degrades performance more the bigger it gets.

          Vijay added a comment -

           I agree.

           Copying from my previous comments:
           "If we want the optimal solution for all the use cases, I think we have to go for the alternative where we save the key cache positions to disk and read them back, and let whatever is missing fault-fill. Agree?"

           "Whatever is missing" => I mean we might have new SSTables that were created between the point we saved the key cache and the restart... those can be fault-filled.

          Pavel Yaskevich added a comment -

           If we want the optimal solution for all the use cases, I think we have to go for the alternative where we save the key cache positions to disk and read them back, and let whatever is missing fault-fill. Agree?

          I need to think about this option.

           Jonathan Ellis added a comment - edited

           With this patch we trade a whole sequential primary index read for random I/O via SSTableReader.getPosition(), one call per saved key.

          I thought Vijay said this sorts the cache first. In which case we're really doing seq i/o, we're just skipping parts that don't have any keys. Right?

           If we want the optimal solution for all the use cases, I think we have to go for the alternative where we save the key cache positions to disk and read them back, and let whatever is missing fault-fill.

          I like this idea. If you have a lot of rows (i.e., a large index) then this is the only thing that's going to save you from doing a lot of i/o. Even with seq i/o, reading a small cache will be much faster than scanning a large index.

           The only downside I see is the question of how much churn your sstables will experience between save and load. If you have a small data set that is constantly being overwritten, for instance, you could basically invalidate the whole cache. But it's quite possible that just reducing the cache save period is adequate to address this. So I think we should give this a try.

          Pavel Yaskevich added a comment -

           It seems like saving the cache's data positions to disk (in combination with the SSTable index summaries), to make it independent from SSTable loading, is the only viable solution we have.

          Vijay added a comment -

           The attached patch does what was suggested.

           NOTE:
           1) We take an additional cache hit while saving the data into the cache.
           2) Older SSTable formats (without the promoted index) will not have their cache saved after the upgrade.
           3) Users have to drop the old caches during the upgrade.

          We should update the Upgrade notes.

          Pavel Yaskevich added a comment -

           We can't require users to drop their caches when they update to the newer version; imagine users with big caches who expect them to be warmed up to serve traffic from the beginning (good user experience on system start-up).

           We should support the current schema (only the cached keys are saved) and the new one (key + position in the SSTable) equally. Also, I don't think the (de-)serializer logic as you did it is really needed, because we would have sufficient data inside the cache file to pre-load the cache with one simple algorithm (in pseudo-code):

          For KeyCache:

          while (!eof(cache_file)) {
             cache_entry = deserialize_entry(cache_file)
          
             # we don't want to load into the cache entry where 
             # SSTable to which descriptor corresponds does not exist (compacted, deleted)
             if (exists(cache_entry.key.descriptor))
                cache.put(cache_entry)
          }
          

          For RowCache:

          while (!eof(cache_file)) {
              # as simple as that because we presume that we did a cache save as part of the shutdown *(which we should do)*
             cache.put(deserialize_entry(cache_file))
          }
          

           It could be as simple as that because we know how to (de-)serialize RowIndexEntry (which KeyCache uses) as well as ColumnFamily (which RowCache uses); note that we would write/read and (de-)serialize both caches with sequential I/O.
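
           For illustration, a rough Java rendering of the KeyCache loop above; CacheEntry, deserializeEntry() and sstableExists() are hypothetical stand-ins, not the actual Cassandra API.

           import java.io.DataInputStream;
           import java.io.EOFException;
           import java.io.IOException;

           final class KeyCachePreloadSketch
           {
               // Hypothetical stand-ins for the types used in the pseudo-code above.
               static class CacheEntry { Object descriptor; Object key; long position; }
               interface KeyCache { void put(Object key, long position); }

               static void load(DataInputStream cacheFile, KeyCache cache) throws IOException
               {
                   while (true)
                   {
                       CacheEntry entry;
                       try
                       {
                           entry = deserializeEntry(cacheFile); // key + descriptor + index position
                       }
                       catch (EOFException eof)
                       {
                           break; // reached the end of the saved cache file
                       }

                       // Skip entries whose SSTable no longer exists (compacted or deleted).
                       if (sstableExists(entry.descriptor))
                           cache.put(entry.key, entry.position);
                   }
               }

               // Hypothetical helpers; the real on-disk format is whatever the patch defines.
               static CacheEntry deserializeEntry(DataInputStream in) throws IOException { throw new EOFException(); }
               static boolean sstableExists(Object descriptor) { return true; }
           }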

           Vijay added a comment - edited

           >>> We can't require users to drop their caches when they update to the newer version; imagine users with big caches who expect them to be warmed up to serve traffic from the beginning (good user experience on system start-up).
           We can add versioning and support the older cache files; for the newer ones we can deserialize an additional byte about the promoted keys.
           I was not sure it is worth it, because moving forward we are going to let the key cache fault-fill for the newer SSTables. If we feel strongly about not dropping the keys, I can add logic for it.

           >>> Also I don't think that (de-)serializer logic is really needed because we would have sufficient data inside of the cache file to pre-load the cache with one simple algorithm (in pseudo-code)
           In the current approach the Descriptor is just a reference, and constructing it would be additional overhead; and for the row cache we don't have the values in the cache file, so we have to query ColumnFamilyStore.

          Jonathan Ellis added a comment -

          I think Pavel is right that we need to give users who rely on the warm cache some kind of solution.

           One alternative is that we could provide a "cache converter" tool that fills in an old-style cache with position information from the index. In other words, basically combining an old-style startup with a new-style cache save.

          Vijay added a comment -

           The attached patch doesn't have any of the above limitations; it adds a version (very basic, as the real versioning is controlled by the SSTable itself).

          Pavel Yaskevich added a comment -

           Why don't you serialize the data (payload) of the caches, which is the main goal of this ticket? CacheSerializer should serialize the whole cache entry, key + payload; that way we would minimize the time taken for processing on deserialize (i.e. when caches are loaded), ColumnFamilyStore wouldn't be involved in the deserialization of the updated cache, and you wouldn't have to make methods such as getTopLevelColumns public. Also, you forgot to force a cache flush upon system shutdown, as I wrote in my previous comment.

          Vijay added a comment -

           >>> Why don't you serialize the data (payload) of the caches
           Well, the payloads can be GBs in size when using the serialized cache, and when you try to load an old cache you will see inconsistencies from the time of the cache save.
           We cannot rely on shutdown hooks to make the data consistent, because (http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)) the OS has timeouts on shutdown, and it doesn't help when there are power failures. The main problem is that RowCache values are mutable while KeyCache values are immutable.

           The main pain during startup is the key cache, but I don't see why it would be a pain to load the row cache.

           Jonathan Ellis added a comment - edited

          Agreed on both counts – I don't think this approach works for row cache, nor do I think row cache is as important to optimize.

          Pavel Yaskevich added a comment -

           I can agree on this point, but on the other hand, if we have a big row cache and we read it using the CF, that would create a ton of random I/O as well as rearrange the key cache that we have preloaded before...

          Jonathan Ellis added a comment -

          Agreed, but that's what you're signing up for by enabling row cache saving (same as today). Which is one reason it defaults to disabled.

          Pavel Yaskevich added a comment -

           Indeed, but by giving it additional disk space to save the payload you could make the I/O sequential and make it harmless to the key cache. I was thinking of saving the row cache into different files for keys and payload, so in the event of a shutdown we could just persist the keys, and on startup we could check the timestamps of the files and decide whether we need to load from the CF or whether it is possible to read the payload from the save...

          Vijay added a comment -

           That still doesn't guarantee that we won't have inconsistent row caches.

           http://stackoverflow.com/questions/8171646/when-shutdown-hooks-break-bad
           kill -9 has been a normal way to shut down Cassandra for us so far.

           Pavel Yaskevich added a comment - edited

           You should probably move to SIGINT (or SIGTERM) for a graceful shutdown, not only for caches but for other services as well (e.g. CASSANDRA-3936). But I'm aware of that problem, and I think that if the row cache is hopelessly outdated we don't really want to load it in the first place (in the current situation or the new one), because it would just be a waste of time...

          Jonathan Ellis added a comment -

          Let's move the row cache discussion to a separate ticket and keep this one focused on key cache as the title says.

          Pavel Yaskevich added a comment -

           Ok, currently I see the following problems in v3:

           • KeyCacheSerializer.serializingSize(...) uses AVERAGE_KEY_CACHE_ROW_SIZE where it should use the actual size of the serialized value.
           • in RowCacheTest.java, the deleted assert should be put back because we really want to test the number of read rows properly.
           • the AutoSavingCache.loadSaved(...) method should return if an exception occurred while loading an old-style cache.
           • in AutoSavingCache.loadSaved(...), the deleted debug output should be restored because it's very useful for diagnostic purposes.
           • we should probably make the CacheSerializer.serialize method return the size of the serialized data; that would be useful on cache writes instead of explicitly calling serializedSize.
           • in ColumnFamilyStore we really should check whether we want to load the cache using (caching == Caching.NONE || caching == Caching.ROWS_ONLY). Right now that check is removed and the cache load is called even if caching was disabled on the given ColumnFamily; the same also applies to the row cache.
           • the following code could be changed (otherwise we would have ifs each time we add a new element):

          From

          if (version == null)
              return new File(conf.saved_caches_directory + File.separator + ksName + "-" + cfName + "-" + cacheType);
          else
              return new File(conf.saved_caches_directory + File.separator + ksName + "-" + cfName + "-" + cacheType + "-" + version +".db");
          

          To

             return new File(conf.saved_caches_directory + File.separator + ksName + "-" + cfName + "-" + cacheType + "-" + ((version != null) ? version + ".db" : ""));
          
          Vijay added a comment -

           I don't know why I decided to delete the debug messages.

          >>> KeyCacheSerializer.serializingSize(...) method uses AVERAGE_KEY_CACHE_ROW_SIZE where it should use an actual size of the serialized value.
           If we get the real value, this will cause a cache hit, plus another cache hit for saving the value. We just need an estimate and we don't need to be accurate IMHO. Added documentation for the same.

          >>> in RowCacheTest.java deleted assert should be moved back because we really want to test the amount of read rows properly.
          Fixed.

          >>> AutoSavingCache.loadSaved(...) method should return if exception occurred while loading old style cache.
          It does

          >>> in the AutoSavingCache.loadSaved(...) deleted debug output should be returned because it's very useful for diagnostic purposes.
          Fixed

          >>> in the ColumnFamilyStore we really should check if we want to load the cache using (caching == Caching.NONE || caching == Caching.ROWS_ONLY). Right now that check if removed and the cache load is called even if there caching was disabled on the given ColumnFamily, the same also applies for the row cache.
          It is harmless but Fixed!

          >>> the following code could be changed (otherwise we would have ifs each time we add a new element):
          Done!

          Pavel Yaskevich added a comment -

           If we get the real value, this will cause a cache hit, plus another cache hit for saving the value. We just need an estimate and we don't need to be accurate IMHO. Added documentation for the same.

           It means that the method doesn't return what it should and should be renamed to *estimate*SerializedSize, but as we would do the same for all of the rows in the cache, I don't see how it would harm the LRU if we did the real calculation.

           The AutoSavingCache.loadSaved(...) method should return if an exception occurred while loading an old-style cache. - It does

           It does not. It just uses "logger.warn(String.format("error reading saved cache %s", path.getAbsolutePath()), e);" and goes on with processing; it also should have the same error message as the next "new" case.

           We should probably make the CacheSerializer.serialize method return the size of the serialized data; that would be useful on cache writes instead of explicitly calling serializedSize.

          What about this one?

          Vijay added a comment -

           >>> It means that the method doesn't return what it should and should be renamed to *estimate*SerializedSize, but as we would do the same for all of the rows in the cache, I don't see how it would harm the LRU if we did the real calculation.
           If you are getting the hot N keys, it will make the hot keys even hotter... Renamed.

           >>> It does not
           The idea is that the newer cache will delete the older one, so it is a no-op after that... Anyway, fixed it just for the sake of it.

           >>> What about this one?
           As we are using an estimate, not the real value, we need to use serializedSize... That's how it was earlier too.

          Jonathan Ellis added a comment -

          Hmm, the size business is a bit tricky.

          I agree that having serialize() return the bytes written is cleaner than retaining a separate [estimated]serializedSize method. (We could let AutoSavingCache wrap its output in a CountingOutputStream that just records the length of what its wrapped stream is given.) However, since we're pretty much stuck w/ an estimate for estimatedTotalBytes anyway, it feels strange to count the same data two different ways.

           But, if we stick with the current design, we're doing three get() calls on each key – not sequentially either: first we get() everything to estimate the total, then we get() each in turn twice, to serialize and to add its size to progress.

          One alternative would be to use an entry set instead of key set in the saving, but CLHM doesn't expose this.

          What if we redefined the problem a bit? Instead of having totalBytes/bytesComplete in CompactionInfo, we could have long total/complete and String units. Then we could just return progress in terms of key count for cache saving.
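
           For illustration, a minimal sketch of the counting-stream idea (a plain FilterOutputStream wrapper; not the actual implementation):

           import java.io.FilterOutputStream;
           import java.io.IOException;
           import java.io.OutputStream;

           // Records how many bytes the wrapped stream has been given, so serialize()
           // callers can report progress without a separate serializedSize() pass.
           class CountingOutputStream extends FilterOutputStream
           {
               private long written;

               CountingOutputStream(OutputStream out)
               {
                   super(out);
               }

               @Override
               public void write(int b) throws IOException
               {
                   out.write(b);
                   written++;
               }

               @Override
               public void write(byte[] b, int off, int len) throws IOException
               {
                   out.write(b, off, len);
                   written += len;
               }

               long getBytesWritten()
               {
                   return written;
               }
           }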

          Vijay added a comment -

          >>> What if we redefined the problem a bit? Instead of having totalBytes/bytesComplete in CompactionInfo, we could have long total/complete and String units. Then we could just return progress in terms of key count for cache saving.

          Done!
           BTW: in addition there is a minor fix for an NPE in CompactionInfo.getID().toString().

          Jonathan Ellis added a comment -

          LGTM, +1

          Nit:

          public CompactionInfo(UUID id, OperationType tasktype, long bytesComplete, long totalBytes, String unit)

          would be better w/ parameters renamed to

          public CompactionInfo(UUID id, OperationType tasktype, long complete, long total, String unit)

          Pavel Yaskevich added a comment -

           +1 with Jonathan's comment, and I also think it would be great to add a CompactionInfo(OperationType, long complete, long total, String unit) constructor so we don't have to pass "null" as the UUID argument (hiding the implementation detail).

          Vijay added a comment -

          Committed with the fix! Thanks


            People

            • Assignee: Vijay
            • Reporter: Vijay
            • Reviewer: Pavel Yaskevich
            • Votes: 0
            • Watchers: 2
