Details

• Type: Improvement
    • Status: Resolved
• Priority: Minor
    • Resolution: Duplicate
    • Fix Version/s: 2.1 beta1
    • Component/s: Core
• Labels: None

      Description

Changing the row cache to a row+filter cache would make it much more useful. We currently have to warn against using the row cache with wide rows, where the read pattern is typically a peek at the head, but this use case would be perfectly supported by a cache that stored only the columns matching the filter.

      Possible implementations:

• (cop-out) Cache a single filter per row, and leave the cache key as is
• Cache a list of filters per row, leaving the cache key as is: this is likely to have some gotchas for weird usage patterns, and it requires the list overhead
• Change the cache key to "rowkey+filterid": basically ideal, but you need a secondary index to look up cache entries by rowkey so that you can keep them in sync with the memtable (see the sketch after this list)
      • others?
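For illustration, a minimal sketch of the third option. All names are hypothetical and the key types are simplified to strings; the point is only that the cache key becomes (row key, filter id), and that invalidation by row key then needs a secondary index from row key to these composite keys (which is what the "ledger" idea later in the thread provides).

import java.util.Objects;

// Composite cache key: row key plus a stable identifier for the filter that produced the entry.
public final class RowFilterCacheKey
{
    private final String rowKey;
    private final String filterId;

    public RowFilterCacheKey(String rowKey, String filterId)
    {
        this.rowKey = rowKey;
        this.filterId = filterId;
    }

    @Override
    public boolean equals(Object o)
    {
        if (!(o instanceof RowFilterCacheKey))
            return false;
        RowFilterCacheKey other = (RowFilterCacheKey) o;
        return rowKey.equals(other.rowKey) && filterId.equals(other.filterId);
    }

    @Override
    public int hashCode()
    {
        return Objects.hash(rowKey, filterId);
    }
}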
Attachments

1. 0001-1956-cache-updates-v0.patch (36 kB, Vijay)
2. 0001-commiting-block-cache.patch (39 kB, Vijay)
3. 0001-re-factor-row-cache.patch (38 kB, Vijay)
4. 0001-row-cache-filter.patch (106 kB, Daniel Doubleday)
5. 0002-1956-updates-to-thrift-and-avro-v0.patch (1 kB, Vijay)
6. 0002-add-query-cache.patch (6 kB, Vijay)


          Activity

          Daniel Doubleday added a comment -

          Hi - I had the same idea some time ago (implemented it on 0.6) and ported it to 0.7 (based on r1061873)

This is not fully tested at all. It's more in an 'it might work' state and nothing is obviously broken. But I thought maybe there's something in there that you like.

          Basic idea:

          • Configure a row filter class for a cf which implements RowCacheFilter
          • The filter is used to
            • check if a query can be served by the cache
• apply changes to the cached row
            • filter and collate columns

I implemented only one filter which limits the columns cached (TailRowCacheFilter) because I think this is the hardest filter that makes sense. Other static filters (cache only certain columns, cache a configured slice with start and end ...) would be rather trivial to implement.
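As a rough illustration of the responsibilities listed above: RowCacheFilter and TailRowCacheFilter are the names used in the patch, but the signatures and the Query/column placeholder types below are assumptions for illustration, not the patch's actual API.

import java.util.ArrayList;
import java.util.List;

// Placeholder for a column-slice query: "give me the first N columns of the cached window".
final class Query
{
    final int count;
    Query(int count) { this.count = count; }
}

interface RowCacheFilter<C>
{
    boolean canServe(Query query, List<C> cachedColumns);    // can the cache answer this query?
    void apply(List<C> cachedColumns, List<C> newColumns);   // fold a write into the cached row
    List<C> collate(Query query, List<C> cachedColumns);     // filter/collate cached columns into a result
}

// Keeps a bounded window of at most `tailSize` columns of a row in the cache.
final class TailRowCacheFilter<C> implements RowCacheFilter<C>
{
    private final int tailSize;
    TailRowCacheFilter(int tailSize) { this.tailSize = tailSize; }

    public boolean canServe(Query q, List<C> cached)
    {
        // serve from cache only if the query asks for no more columns than are cached
        return q.count <= cached.size();
    }

    public void apply(List<C> cached, List<C> newColumns)
    {
        // assumes new columns sort to the front of the cached window (e.g. newest-first ordering)
        cached.addAll(0, newColumns);
        while (cached.size() > tailSize)
            cached.remove(cached.size() - 1);      // trim everything past the window
    }

    public List<C> collate(Query q, List<C> cached)
    {
        return new ArrayList<>(cached.subList(0, Math.min(q.count, cached.size())));
    }
}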

          The main problems in case of the TailRowCacheFilter that I am aware of are:

          • handling deletions
          • bounding the row cache without locking

Rather than keeping track of tombstones in the cache (could not figure out how to do it without locking), deletions are not handled but 'expected', and accounted for by caching columns * expected_ts_ratio (which is configurable).

I don't know if my take on limiting the column count in a cached row is working. It involves CF.getEstimatedColumnCount a lot and this becomes quite expensive with large rows. In a simple test with 1M columns and a cache limit of 100k, writes to the cache became 30% slower (compared to the identity row cache filter). I added a count cache in CF which eliminates the performance problem. It is not entirely thread safe though (at least clear(), but that is never called concurrently).
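Purely to illustrate the kind of memoization described here (the class and field names are made up, not the patch's code): cache the expensive estimated column count and recompute it only after it has been cleared.

import java.util.function.IntSupplier;

final class ColumnCountCache
{
    private static final int UNKNOWN = -1;
    private volatile int cachedCount = UNKNOWN;

    int estimatedColumnCount(IntSupplier expensiveCount)
    {
        int c = cachedCount;
        if (c == UNKNOWN)
        {
            c = expensiveCount.getAsInt();   // e.g. walking a very wide row
            cachedCount = c;                 // benign race: at worst we compute it twice
        }
        return c;
    }

    // as noted above, clearing is only safe if it is never called concurrently with updates
    void clear()
    {
        cachedCount = UNKNOWN;
    }
}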

What's missing:

          • migration
          • confidence that it actually works
          Jonathan Ellis added a comment -

Rather than keeping track of tombstones in the cache (could not figure out how to do it without locking), deletions are not handled but 'expected', and accounted for by caching columns * expected_ts_ratio (which is configurable).

This seems pretty error prone. What about just invalidating (removing from the cache) the row on delete and letting it get rebuilt on the next read? Either way, it means TailRowCacheFilter is not a great fit for workloads w/ lots of deletes (which I am okay with) but at least you don't risk outright invalid answers this way.

          Daniel Doubleday added a comment -

          Allow filter to invalidate row

          TailRowCacheFilter invalidates cache if a column within its range is deleted

          Daniel Doubleday added a comment -

          @Jonathan

          Yes I like the invalidation idea.

          @Stu

We actually have existing 'filter' implementations (in org.apache.cassandra.db.filter) that I think would make the most sense for use alongside cache entries.

          In my implementation the RowCacheFilter has more responsibilities than just filtering the columns. It's more like a plugin interface and filter is probably not the right name for it. It actually uses the db.filter classes.

I guess I don't understand what you proposed as a general mechanism. As I said, I had this implementation on 0.6 before and just ported it. My main point was that I think it would be really helpful if the plugin mechanism allowed for custom cache handlers (maybe that would be a better name) with a generic configuration mechanism (filter params in yaml), because every use case / data model can be very specific.

          Also, regarding the "tombstones in cache" problem: I believe it came up in IRC the other day. The solution that seemed closest to our existing methods was to keep the tombstones in cache, but to add a thread that periodically walked the cache to perform GC (with our existing GC timeout) like we would during compaction.

I don't understand how that would help me: the TailRowCacheFilter contract is that it can return a specific number of live columns. These are cached. If one of them gets deleted it would not be able to return a valid response. I tried to cope with that with the expected deletion ratio. But I agree with Jonathan that this is not good enough.

          Again you might have an entirely different solution in mind ...

          Stu Hood added a comment -

          > These are cached. If one of them gets deleted it would not be able to return a valid response.
          Ahh, sorry: quite right. Invalidation sounds like the best option there.

          I'll try and review this more closely in the next week, but I'm not sure I like the filter as a configuration option, as opposed to any of the ideas in the summary.

          Daniel Doubleday added a comment -

Now it's my turn for an 'ahh' - I think I understood your idea.

Instead of configuring cache rules you want to cache every filtered request, like the MySQL query cache?

I dropped the idea because I thought it would be either very restricted to certain query patterns or very complicated to keep in sync with memtables and/or decide whether a query can be served by the cache. Also it might be hard to avoid the cache being polluted (analogous to the page cache eviction problem during compaction). It might force the developer to spread the data over multiple CFs according to access pattern, which increases memory needs (more memtables, more rows).

          But yes - if you can come up with an automagical cache management that just works that would be obviously nicer!

PS: If you want to have a look at the patch: apply to 0.7 r1064192

          Vijay added a comment -

An alternative approach:

This is still a v0 prototype; once you guys agree on this I will make a fresh version.

The ledger is mainly a lookup map used to invalidate the filters when there are new changes.
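A minimal sketch of what such a ledger could look like, with hypothetical types (not the patch's actual classes): a reverse map from row key to the cached filter entries for that row, consulted on every write so the affected cache entries can be invalidated.

import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class Ledger<K, F>
{
    // row key -> every cached filter entry that was built from that row
    private final Map<K, Set<F>> filtersByRow = new ConcurrentHashMap<>();

    void record(K rowKey, F filterEntry)
    {
        filtersByRow.computeIfAbsent(rowKey, k -> ConcurrentHashMap.newKeySet()).add(filterEntry);
    }

    // on a write to rowKey: return (and forget) every cached entry that is now stale
    Set<F> invalidate(K rowKey)
    {
        Set<F> stale = filtersByRow.remove(rowKey);
        if (stale == null)
            return Collections.emptySet();
        return stale;
    }
}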

          Things to do:
1) Make the ledger an object and attach it to the cache instead of extending ASC
2) More fine-grained locks on the ledger

          Chris Burroughs added a comment -

          Doesn't the QueryCache's ledger also need to have entries removed when entries are evicted from the primary cache?

Would it be reasonable to have a tuning knob so only rows with more than n columns are filter cached, to avoid paying the penalty of storing the row key twice?

          Jonathan Ellis added a comment -

          Do we store the full key twice, or just two references to the same key object? The latter would be negligible IMO. (And asking "How many total columns are in this row?" isn't free, either.)

          Vijay added a comment -

Hi Chris, I will add the eviction in v1....
Jonathan, yes, in short it is just the reference.

There will also be duplicate data (columns in the cache) if the queries overlap (which can be further optimized later?)

          Daniel Doubleday added a comment - - edited

          As I wrote earlier I'm a little sceptical that a query cache like this will be useful in many cases but since there is something going on here and Jonathan asked for a wish list:

Would you guys consider making the row cache a little more pluggable? This would allow us to maintain custom implementations more easily. Also I think that the core code could benefit as well by moving some ifs out of CFS.

Instead of implementing the control flow in CFS and using the cache as a map, you could introduce a RowCache instance that would act more like a service layer:

interface RowCache {

    // returns the filtered row, ready to serve; reads the row from the ColumnFamilyStore if necessary
    ColumnFamily getRow(ColumnFamilyStore store, QueryFilter filter, int gcBefore);

    // notify the cache of a mutation; it can update or invalidate.
    // most implementations will not need the store param, but it might come in handy in special cases
    void apply(ColumnFamilyStore store, DecoratedKey key, ColumnFamily cf);
}
          

This way CFS would need no knowledge of whether a cache is able to update or only invalidate, and, when it invalidates, whether it has to invalidate the whole row or just portions of it. Also there would be no expectation about the internal caching format. The row cache could do whatever it likes.

          In CFS there would be only the cache reference. No distinction between old row cache, query cache, off-heap-cache, my-awesome-very-specialized-cache would be necessary.

          Vijay added a comment -

Sure, if no one else has a concern I can do the refactor as a part of this patch...

          Vijay added a comment -

First take on the refactor; I am not really happy with this, but thought of sharing it to see if there are any concerns with it...

          TODO: AutoSaving for query cache.

          Daniel Doubleday added a comment -

Looking at CASSANDRA-3143 I understand that the consensus seems to be that users should be protected from misconfigurations and that a sane 'works most of the time' solution is the way to go.

          Still I want to note why I do believe that some solution along the lines of your patch is a good idea.
Obviously I can only speak for us and our use cases, but our app can only work with the relevant data set in memory.

Being able to optimize caching using knowledge about data access patterns could significantly improve efficiency.

          Right now we still need first level caching for a lot of use cases which makes life not necessarily easier.

          With the patch custom cache providers could (on a per cf basis) cache partial rows, optimize memory format or even integrate other caching solutions. We could also skip caching entirely for certain queries. This would also be great for maintenance jobs which would otherwise lead to cache thrashing.

          I understand that nobody wants to scare off adopters but all of this would not really change anything for people who want to go with the sane standard. And of course it is also clear that such extension points cannot be guaranteed to be stable.

          Yada yada yada ...

          Thanks for the effort btw!

          Pavel Yaskevich added a comment -

How about we make the row cache value more generic so we can distinguish between fat/thin rows: the cache would access a "filter" object which would decide what to do with a row according to its size (and some other characteristics, probably) - should the row be stored directly as a ColumnFamily, or by slice as hash filter => ColumnFamily, or should it start slicing from a particular size?... Row cache options could be customized the same way as compaction strategy and compression parameters...

          Pavel Yaskevich added a comment -

What I propose is: let's make the row cache more configurable according to the way we handle big rows. We can allow the cache to take a per-cf "filter" class which would have settings for the max row size to cache (and probably some other options in the future), so we can feed it a ColumnFamily from getTopColumns and let it decide if we should just keep that ColumnFamily in cache as is, because it's a thin row, or (if we are querying different parts of the big row) keep QueryFilter and ColumnFamily.
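A sketch of that decision, with generic placeholders standing in for the real ColumnFamily/QueryFilter classes: below a configured size threshold the row is cached as-is with no filter attached; above it, the filter is kept alongside the cached slice.

final class RowCacheEntry<CF, F>
{
    final CF cached;   // the whole row (from getTopColumns) for thin rows, or just the queried slice for wide ones
    final F filter;    // null for thin rows cached in their entirety

    private RowCacheEntry(CF cached, F filter)
    {
        this.cached = cached;
        this.filter = filter;
    }

    static <CF, F> RowCacheEntry<CF, F> of(CF columns, F filter, long rowSizeInBytes, long maxWholeRowSize)
    {
        return rowSizeInBytes <= maxWholeRowSize
             ? new RowCacheEntry<>(columns, null)      // thin row: no per-filter overhead
             : new RowCacheEntry<>(columns, filter);   // wide row: remember which filter produced this slice
    }
}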

          Jonathan Ellis added a comment -

          What if we turned it into a real query cache? That way we don't have to predefine filters in the schema, we just use the IFilter objects from the queries involved.

          Pavel Yaskevich added a comment -

That would work. Actually, I think we can slightly modify IFilter to return the number of columns involved, or use the resulting ColumnFamily.serializedSize() to check against a threshold. I just want to make sure the cache has minimal memory overhead associated with keeping filters for thin rows.

          Vijay added a comment -

If I understand it correctly, we need a configuration which will tell the cache to cache filters only if the returned row is larger than some size x?

          Jonathan Ellis added a comment -

          I don't know, is it really worth keeping the old "cache entire rows" behavior around if we have something more sophisticated?

          Jonathan Ellis added a comment -

          ... to answer my own question, "I want to keep an entire CF in memory" is a fairly common request. So maybe the answer is, we support that more directly, as well as the query cache.

          Sylvain Lebresne added a comment -

          ... to answer my own question, "I want to keep an entire CF in memory" is a fairly common request. So maybe the answer is, we support that more directly, as well as the query cache.

Thinking out loud, but with secondary indexes, we use the trick of expanding the filter to a full row filter if the maxRowSize for the cfs is smaller than some value (the columnIndexSize more specifically). Given that keeping an entire CF in memory really makes sense only for static (narrow) CFs, we could just expand filters for those CFs automatically and just have a query cache as far as the cache implementation is concerned.
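A tiny sketch of that expansion trick (a hypothetical helper, not the actual 2ary index code): when the largest row of a CF is known to be small, swap the incoming filter for the identity (whole row) filter before consulting the cache, so narrow CFs end up fully cached while the implementation remains a plain query cache.

final class FilterExpansion
{
    interface Filter
    {
        boolean isIdentity();   // a whole-row filter
    }

    static Filter maybeExpand(Filter queryFilter, long maxRowSize, long expansionThreshold, Filter identityFilter)
    {
        // small rows: caching the whole row costs little more than caching just the requested slice
        return maxRowSize <= expansionThreshold ? identityFilter : queryFilter;
    }
}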

As a side note, I share Daniel's opinion (at least I believe that's what he meant earlier) that a serializing query cache that invalidate on update will be very useful for wide rows. At least I don't see many use cases where it would. However I see a cache that would not invalidate on update but keep the cached data matching the filter be much more useful, even if we start with an on-heap cache. And once we've agreed that we'll sidestep the delete problem by invalidating, I don't think it's too hard to do.

          Daniel Doubleday added a comment -

          As a side note, I share Daniel's opinion (at least I believe that's what he meant earlier) that a serializing query cache that invalidate on update will be very useful for wide rows.

If there's a 'not' missing here then yes

          However I see a cache that would not invalidate on update but keep the cached data matching the filter be much more useful

Just as an implementation idea that could make this easier: if the cache data were merged with memtables upon read, you could merge/write back cache data upon memtable flush, which avoids synchronization headaches and might be more efficient (we did something similar in CASSANDRA-2864)

          Pavel Yaskevich added a comment -

That is why I propose to combine the current technique and filter-data, using the first for small rows and the latter for wide ones without on-update invalidation. And I agree with Daniel and Sylvain that a serializing query cache that invalidates on update won't be very useful in most cases.

          Vijay added a comment -

Cool, I will do the following:

• Make the cache type configurable: serializing or in-heap.
• Make the size x configurable: either full row or partial.
• Add a ranking of queries (see the sketch below).
  Example: IdentityFilter will have the highest rank and any query can fetch the data from it; in other words, compare the query and see if the returned data will be a subset of the cached query, and if yes then return those columns from the cache....
  Use the same ranking, when not invalidating the cache, to update the rows (which can be tricky but I can give it a shot).
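A sketch of the subset test described in the last bullet, with column names compared as plain strings standing in for the real comparator: a request can be served from a cached entry when its slice is contained in the cached slice, and the identity filter (whole row, unbounded on both sides) covers everything.

final class CachedSlice
{
    final String start;   // "" means unbounded, like the identity filter
    final String end;     // "" means unbounded

    CachedSlice(String start, String end)
    {
        this.start = start;
        this.end = end;
    }

    // true if a query for [queryStart, queryEnd] is guaranteed to be a subset of this cached slice
    boolean covers(String queryStart, String queryEnd)
    {
        boolean startOk = start.isEmpty() || start.compareTo(queryStart) <= 0;
        boolean endOk = end.isEmpty() || end.compareTo(queryEnd) >= 0;
        return startOk && endOk;
    }
}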
          Jonathan Ellis added a comment -

That is why I propose to combine the current technique and filter-data, using the first for small rows and the latter for wide ones

I'd rather avoid the complexity of keeping both implementations around. If the rows are small enough that keeping the whole thing in memory is the right tradeoff, then users can optimize that themselves by using "select *" instead of "select x" and "select y" (i.e., the former would result in just one cache entry for the row). I suspect it won't matter a great deal in real-world scenarios anyway.

          How about this?

          • Query cache replaces row cache, with on/off heap implementations based on existing SC/CLHC. Use CLHM weight feature to rank by query result size.
          • Cache key becomes (row key, query filter)
• When applying an update to row X, check the query cache for filters on X. Update the cached CF with the new data for on-heap, invalidate for off-heap.
          • New ticket for "pin CF in memory" feature
          Pavel Yaskevich added a comment -

I don't say that we should keep both implementations, I suggest we combine them. I'm not a big fan of keeping a filter for small rows because it creates superfluous overhead.

          Jonathan Ellis added a comment -

You could be right. I'd still prefer to have everything go through a single code path first, and then we can evaluate optimization afterwards.

          Daniel Doubleday added a comment -

          users can optimize that themselves by using "select *" instead of "select x" and "select y"

I guess it's not totally atypical right now to model data in a way that fits the current caching scheme. E.g. we have UserData rows with around 10 columns and around 50 - 100k row size. All of them are read during one session, at different times. With the proposed new caching scheme we have the choice to either create 10x cache misses and a lot more objects to GC, or load the entire row everywhere.

          Jonathan Ellis added a comment -

          So... this proposal would be worse than the status quo for you? I thought this was your idea!

          Daniel Doubleday added a comment -

          Nice try

I was Mr. 'I don't believe in magic' from the very beginning. I thought that there are some common cases like tail and head caching or exclusion of columns that could be implemented. But never mind - maybe the proposed cache is all greatness. All I'm trying to say is that it's pretty easy to end up in propagation-failure hell here, or change something else that blows things up for use cases that are not foreseen.

So to prevent a rude awakening for some users it might be cool to provide some means (config or whatever) that works similarly to the current version. Or at least schedule this for a release which allows for a downgrade.

          Just saying ...

          Jonathan Ellis added a comment -

          I thought that there are some common cases like tail and head caching or exclusion of columns that could be implemented.

Right. That's what we want to get out of this – well, the tail/head part, since the query cache can only cache what you do ask for; it can't exclude what you don't. (Although if you never query the excluded column... close enough, right?)

So to prevent a rude awakening for some users it might be cool to provide some means (config or whatever) that works similarly to the current version.

          Good point. Damn it.

          Jonathan Ellis added a comment -

Maybe we should just leave the existing row cache in for 1.2 and deprecate it in favor of the query cache, and remove it in 1.3.

          Daniel Doubleday added a comment -

          I think that would be great ...

          Although if you never query the excluded column... close enough, right?

Well, maybe you could incorporate hinting, as MySQL or Oracle do, in Thrift and CQL, as in

          SELECT * FROM WhatEver IGNORE CACHE (*) WHERE KEY = 'ComesMyWay'
          

          or

          SELECT * FROM WhatEver /*+ nocache(*) */ WHERE KEY = 'ComesMyWay'
          

          That would also prevent cache pollution when you need to run jobs

          Jonathan Ellis added a comment -

          I suppose, but best practice is still going to be to run that kind of job on separate replicas, so that feels pretty low priority to me.

          Sylvain Lebresne added a comment -

          Thinking about this a bit more, I'm not sure I'm convinced by a query cache. Or rather, I think that a cache + filter defined with the schema could be much simpler and imo likely good enough.

More precisely, with a query cache, when you update a (cached) row, you have to check every cached query for that row to see if it should be updated. I'm afraid there will be cases where this will be inefficient, and this will put the burden on the user to make sure they don't make queries that hit those inefficiencies. I'm also really not a fan of putting the burden on the user to query the full row if they want it cached fully, for equivalent reasons.

          It seems to me that what we want to handle here is exactly the 'cache head or tail of row' problem. If so, it seems to me that simply adding a per-cf (optional) filter to the cache has the following advantages:

          • It handles the head/tail use case, as well as the current cache all row case.
          • You don't have to care about the problems I mention above
          • There is no in-memory overhead of filters problem. We just keep one filter per cf. We can allow more than one filter per-cf in the future if that proves useful, which I'm not even too sure it will.
• There is no upgrade headache at all, no new cache that users will have to switch to, nothing we'd have to deprecate. No new mental model for the user of how things are cached, just an imho very natural new option of being able to select what part of the row is cached.
• No question of having two solutions. The current cache will just be the case where there is no filter configured (or the filter is the identity filter; whether we "optimize" the no-filter case or use the identity filter is really an implementation detail).

Now the only downside I could see to that compared to a query cache is the fact that you have to define the filter with the schema. I really see this as almost anecdotal. It doesn't seem very complicated (and is certainly simpler than having to change your query to make sure what you want is cached) to write something along the lines of:

          CREATE TABLE timeline (
              userid uuid,
              timestamp time,
              action text,
              PRIMARY KEY (userid, timestamp)
          ) WITH COMPACT STORAGE AND CACHING FIRST 100;
          

          (note the use of our tentative syntax of CASSANDRA-2474 for wide rows) or even

          CREATE TABLE users (
              userid uuid PRIMARY KEY,
              firstname text,
              lastname text,
              age int,
              email text,
picture binary
          ) WITH CACHING (firstname, lastname, email);
          

if one is so inclined to do that (because he doesn't want to cache profile pictures, for instance).

          Daniel Doubleday added a comment -

          Well that is pretty much exactly what the initial patch does.

The only inefficiencies, as noted above, are deletions, which are hard to handle without invalidation, and, now that we have them, TTL columns.

That is, if 'CACHING FIRST 100' implies a promise that the user will receive 100 valid columns.
The easy way out would be to state that the 100 includes tombstones.

          Sylvain Lebresne added a comment -

The only inefficiencies, as noted above, are deletions, which are hard to handle without invalidation, and, now that we have them, TTL columns.

Yes, though I'll note that a query cache has the exact same problem, so it's more a problem we have to deal with than anything else. And I'm fine with starting by just invalidating for deletion and TTL at first. On a second iteration, I think we could improve the TTL case by keeping, for each cached row (at least those associated with a non-identity slice filter), a firstColumnToExpireTimestamp. We would set that to the smallest localExpirationTime in the cached row. Gets and puts would just invalidate the cache if now >= firstColumnToExpireTimestamp. But again, we don't have to concern ourselves with that initially.
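A small sketch of that second-iteration idea; the class and field names are hypothetical: each cached row remembers the earliest localExpirationTime among its TTL'd columns, and gets and puts drop the entry once that moment has passed.

final class CachedRow<CF>
{
    final CF columns;
    final long firstColumnToExpireAt;   // Long.MAX_VALUE when nothing in the cached slice has a TTL

    CachedRow(CF columns, long firstColumnToExpireAt)
    {
        this.columns = columns;
        this.firstColumnToExpireAt = firstColumnToExpireAt;
    }

    // checked on gets and puts: once the first TTL'd column may have expired,
    // the cached slice can no longer promise N live columns, so invalidate it
    boolean isStillValid(long now)
    {
        return now < firstColumnToExpireAt;
    }
}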

          Jonathan Ellis added a comment -

          It seems to me that what we want to handle here is exactly the 'cache head or tail of row' problem.

          That, and "I want to cache a specific set of known-ahead-of-time columns [maybe the entire row]," which is what today's row cache is mostly used for.

          I think it's a huge, huge win for a design to be able to handle both of these, without requiring it to be specified in the schema. We've been moving away from hand-tuning, towards self-tuning, for a very good reason: when you require humans to do the right thing to be efficient, you're going to be inefficient an awful lot of the time.

          Sylvain Lebresne added a comment -

          That, and "I want to cache a specific set of known-ahead-of-time columns [maybe the entire row]," which is what today's row cache is mostly used for.

That is trivially handled by the filter-per-cf approach I'm advocating, contrary to the query cache solution.

          I think it's a huge, huge win for a design to be able to handle both of these, without requiring it to be specified in the schema.

Again, I really don't think specifying it in the schema is such a big deal in that case (I insist on the "in that case", I'm not pretending hand-tuning is never a big deal), nor does it feel like a hard one to get right.

          Now don't get me wrong, I agree that self-tuning is great, but only if we know how to do it correctly. Typically, and to refer to some ideas above, I think that if users have to think about what query they should do to have good caching (like using select * when really they want select x, y but want to keep the full row in cache, or to be careful that if they use too many different queries for a given row it won't play well with the cache), then 1) it's still hand-tuning and 2) one that is imo far less convenient/intuitive.

Basically what I'm saying is that with a query cache, I see a number of unknowns, of added difficulties (what about the space taken by all those filters per query? how do we make sure to cache the full row when it's the right thing to do without any user intervention? etc...) and of cases where it will be less efficient than the filter-per-cf alternative unless the user is super careful (will that be a problem in real life? maybe not, but maybe). On the other hand, adding a simple per-cf filter is a nice simple increment over what we have and we stay in known territory while solving the problem we want to solve.

Besides, if specifying a filter with the schema is that much of a problem, maybe we can make that choice automatically. We have stats on the rows' avg and max size, and we can easily start gathering some simple stats on queries, at least enough to be able to say whether it's the head or tail that we need to keep in cache for wide rows. Though honestly, even if we do that, my preference would largely go to still allowing the user to override whatever automatic choice we come up with if they wish.

          Vijay added a comment -

How about a query cache which will cache all queries by default:
1) Users can set an upper bound on the size per row (cache rejection handle).
2) Users can also say cache everything from startWith="a" to endWith="x" if the query falls within those bounds. The first time we see such a query for a row we will populate the cache with the predefined query, and subsequent queries which fetch results within these limits will be served from the cache; the rest will go to disk.
3) The existing row cache is nothing but a configuration with startWith="" and endWith="" (everything in a row).

          Makes sense?

          Jonathan Ellis added a comment -

          I really don't think specifying it in the schema is such a big deal

          Maybe, but I see a true query cache as being better than the row cache in 90% of situations. It's only an accident of implementation that we did the row cache first. So it feels weird to me to argue to keep it instead of moving to a more flexible model. (One that could accommodate 2ary index queries as well, for instance.)

          Vijay added a comment -

          So for this ticket:

1) Expose the rowCache APIs so we can extend them more easily.
2) Reduce the query cache memory footprint.
3) Reject rows > x (configurable).
4) Writes should not invalidate the cache (configurable; if we don't invalidate, we take some hit on write performance).

          Reasonable? Anything missing?

          Vijay added a comment - - edited

This patch is not complete yet; I just wanted to show it and see what you guys think. It is something like a block cache: it caches blocks of columns, the user can choose the block size, and if the query falls within a cached block we just pull that block from memory; otherwise we scan through the blocks and fetch the required ones from disk. Updates can also scan through the blocks and update them in place. The good part is that this should have a lower memory footprint than a query cache while still solving the problems we are discussing in this ticket. It doesn't support super columns, and I don't plan to add that. Let me know, thanks! Again, there is more logic and more cases to handle; it's just a prototype for now.

          Sylvain Lebresne added a comment -

          it should also solve the problems which we are discussing in this ticket

          What are those?

I'd like us to be a little scientific on that issue. What is it we are trying to do in the first place? My take on that (and please feel free to correct me if I'm missing something) is that the kinds of caching that I can really see being useful in practice are:

          1. Caching a row entirely; that's what we do and I think we agree we should keep that feature because sometimes that's what you want.
          2. Caching the head or the tail of a row for wide rows.
          3. I could also imagine cases where you want to only pin a few columns (by name) into the cache without keeping the row entirely.

And well, that's it. I try to think of other types of (not far-fetched, hypothetical) workloads where caching could be a notable win but that are not handled by the 3 cases above, and I don't really find one. Now I apparently am stupid and miss 90% of situations since:

          but I see a true query cache as being better than the row cache in 90% of situations

because the 3 cases above are perfectly handled by the idea of just adding a filter per-cf to our current row cache (which btw could easily be extended to 2-3 filters per-cf if that proves necessary). So please let's share those cases that are not covered above and that we want to handle as part of this ticket.

          But if what's above does sum up the problem we want to solve, then I continue to think that simply adding a per-cf filter alongside our current row cache is the best solution:

• there is no memory overhead.
• all 3 caching use cases above are handled without any drawback that I can think of.
• it's an incremental change to the existing code, not a completely new thing, thus lowering the risk of introducing new bugs. Typically, I can easily see how CASSANDRA-3862 will translate to that solution; but I suspect things may get more complicated for, say, a query cache.

The only criticism that I've seen so far of that solution is the question of the user configuration of the cache, whereas for the query cache there wouldn't be any configuration (which remains to be proven, btw, if we want to support the 'always stick a row entirely in cache' case). If someone considers that auto-configuration should be an absolute priority then let's discuss that, because I disagree (to sum up, I think any auto-configuration of caches will have drawbacks, so users should be able to override the default, and so it's more sane to start with a cache that users can make do what they want and then evaluate how to make that configuration mostly automatic, which I think can be done).

So before considering other solutions, I'd like to understand more clearly why we're discarding that per-cf filter idea. Because currently it seems to strike a pretty nice balance between fixing what seems to be the problem and the added complexity.
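For concreteness, the per-CF filter being argued for here could be as small as the interface sketched below. The names are hypothetical, the head-of-row policy is shown only as an example of use case 2 above, and this is not the code from the attached patches:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical per-CF cache filter: each column family declares, in its schema,
    // which part of a row is worth keeping in the row cache.
    interface CachePolicy
    {
        /** Can this read be answered entirely from the cached portion of the row? */
        boolean satisfies(SliceQuery query);

        /** Trim a freshly read row down to the portion that should be cached. */
        List<ByteBuffer> trim(List<ByteBuffer> columnsInOrder);
    }

    // Example policy for use case 2: keep only the first N columns (the "head").
    final class HeadCachePolicy implements CachePolicy
    {
        private final int columnsToKeep;

        HeadCachePolicy(int columnsToKeep) { this.columnsToKeep = columnsToKeep; }

        public boolean satisfies(SliceQuery query)
        {
            // A forward slice starting at the beginning of the row and asking for
            // no more columns than we cache can be served from the cache.
            return !query.reversed
                && !query.start.hasRemaining()
                && query.count <= columnsToKeep;
        }

        public List<ByteBuffer> trim(List<ByteBuffer> columnsInOrder)
        {
            int n = Math.min(columnsToKeep, columnsInOrder.size());
            return new ArrayList<>(columnsInOrder.subList(0, n));
        }
    }

    // Minimal stand-in for a slice read, just enough for the sketch.
    final class SliceQuery
    {
        final ByteBuffer start;     // empty buffer == from the beginning of the row
        final int count;
        final boolean reversed;

        SliceQuery(ByteBuffer start, int count, boolean reversed)
        {
            this.start = start;
            this.count = count;
            this.reversed = reversed;
        }
    }

A "cache the whole row" policy would simply always return true from satisfies and return the columns untrimmed, and a tail or named-columns policy would be equally small.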

          Lior Golan added a comment -
          1. Caching a row entirely; that's what we do and I think we agree we should keep that feature because sometimes that's what you want.
          2. Caching the head or the tail of a row for wide rows.
          3. I could also imagine cases where you want to only pin a few columns (by name) into the cache without keeping the row entirely.

What about secondary indexes? I guess you'd like to be able to cache "hot" columns in secondary index CFs.

          Sylvain Lebresne added a comment -

          What about secondary indexes? I guess you'd like to be able to cache "hot" columns in secondary indexes CF

It's a good question. Secondary index CFs are accessed from beginning to end, potentially using paging, which means it's unclear to me whether we can do much better than caching whole secondary index rows. But again, I'm happy to discuss that; what I'm mainly saying is that we can't choose the best solution unless we clearly identify the problems we are trying to solve.

          Jonathan Ellis added a comment -

          Right now we have a cache that is only useful for accelerating queries against rows that fit easily into memory, and even then it is inefficient if you only care about part of the row (either name-based or slice-based).

The filter approach allows us to make slice-based queries more efficient (somewhat clumsily) but doesn't really address the inefficiency for name-based queries. As Lior points out, it also doesn't help with 2I queries, while with a true query cache we could do write-through updates on 2I queries as well ("select * from users where birth_date = 1980"). (This is a fairly straightforward jump to make for queries within a single composite-PK row, and admittedly more complex once spanning multiple physical rows gets involved.)

          Sylvain Lebresne added a comment -

          The filter approach allows us to make slice-based queries more efficient (somewhat clumsily)

          What is so clumsy?

          but doesn't really address the inefficiency for name-based queries

Depends on what we're talking about. The filter approach would allow setting a name-based filter, but ok, that is less convenient. The query cache is not perfect either, though: if you do different name-based queries, we will end up caching the same data multiple times. We may be able to optimize this, but then it becomes fairly complicated.

          while with a true query cache we could do write-through updates on 2I queries as well

          I'm not sure I understand, could you clarify your idea?

Don't get me wrong, I'm not totally closed to the idea of a query cache or something like it, but I do want to make sure we don't jump on it without good reasoning behind it, because I do fear a query cache will come with a bunch of complications (and while you may have good reasoning, I personally don't yet see clearly that it's the best choice, so I'll need some convincing). The query cache also has the risk of caching the same thing multiple times. Take a CF on which you do some paging: provided the row receives a few updates, we'll end up re-caching the same things multiple times (unless we're really smart about it, but I'm pretty sure that's not a simple problem). I'm not sure how much of a problem that'll be in practice but ...

Then there is also the fact that the way you model in C* is usually one CF per kind of query. So it does feel like caching each query separately shouldn't be necessary. But that's not a technical argument.

          Jonathan Ellis added a comment -

          What is so clumsy?

First, that you need to explicitly configure it as part of the schema. Second, because it inherently only allows one type of query to be cached. "One CF per query" is the wrong rule of thumb; "One CF per type of resultset" is my preferred one. So, for instance, you couldn't cache both the oldest and the newest entries from the same row in a CF.

          while with a true query cache we could do write-through updates on 2I queries as well

          What I mean by this is that select * from users where birth_date = 1980 is a query that people could reasonably want to cache, that we can't fit into your 3 categories of "full row, head/tail, handful of named columns."

At a more sophisticated stage, a "true" query cache could update the cached resultsets whenever someone updates the birth_date value to or from 1980, so the query stays fast without having to be recalculated. (We already have the perfect place in the code for this, where index maintenance happens in Table.apply.)
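As a rough illustration of that write-through idea (hypothetical types and names, not the actual Table.apply code): since the write path already sees both the old and the new value of an indexed column, a cached result set keyed by the indexed value could be adjusted in place rather than invalidated or recomputed:

    import java.nio.ByteBuffer;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical write-through maintenance for a cached "WHERE birth_date = ?" result set.
    final class IndexedResultCache
    {
        // indexed value -> set of row keys currently matching that value
        private final Map<ByteBuffer, Set<ByteBuffer>> cachedResults = new ConcurrentHashMap<>();

        /** Called from the write path, alongside index maintenance. */
        void onUpdate(ByteBuffer rowKey, ByteBuffer oldValue, ByteBuffer newValue)
        {
            if (oldValue != null)
            {
                Set<ByteBuffer> oldSet = cachedResults.get(oldValue);
                if (oldSet != null)
                    oldSet.remove(rowKey);   // row no longer matches the old value
            }
            if (newValue != null)
            {
                Set<ByteBuffer> newSet = cachedResults.get(newValue);
                if (newSet != null)
                    newSet.add(rowKey);      // only maintain result sets we already cache
            }
        }

        /** Returns the cached result set for a value, or null on a cache miss. */
        Set<ByteBuffer> get(ByteBuffer indexedValue)
        {
            return cachedResults.get(indexedValue);
        }

        /** Populate the cache after a query has been answered from disk. */
        void put(ByteBuffer indexedValue, Set<ByteBuffer> matchingRowKeys)
        {
            Set<ByteBuffer> copy = ConcurrentHashMap.newKeySet();
            copy.addAll(matchingRowKeys);
            cachedResults.put(indexedValue, copy);
        }
    }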

          Sylvain Lebresne added a comment -

          So for instance, you couldn't cache both oldest entries and newest, from the same row in a CF.

Well, as I said earlier, that would just require having a handful of queries per CF rather than just one. Granted, this may blur a little bit the actual complexity difference with respect to a query cache, but it's still likely simpler and with less overhead.

I think that for a good part what bothers me with a pure query cache is that picking what to cache is difficult to do automatically, and looking at each query in isolation (which is what a query cache does) is not necessarily the right thing. The typical example is when the right thing to do is to cache the whole row even though you never query the full row (maybe one part of the code queries the firstname and lastname, another the email, another the phone). We've mentioned the idea of having 'cache the full row' as a special case, but that doesn't sound very convenient. And it makes me wonder if we won't have the same problem in other situations, where the query cache actually plays against you because it just doesn't see the big picture, while the user usually knows that big picture. In any case, what I meant by my previous comment is that pinning a full row into the cache is imho something we should keep (without forcing the user to always query the full row to get it), and that's not handled by a true query cache.

          What I mean by this is that select * from users where birth_date = 1980 is a query that people could reasonably want to cache, that we can't fit into your 3 categories of "full row, head/tail, handful of named columns."

At a more sophisticated stage, a "true" query cache could update the cached resultsets whenever someone updates the birth_date value to or from 1980, so the query stays fast without having to be recalculated. (We already have the perfect place in the code for this, where index maintenance happens in Table.apply.)

That's a good point. I agree that in that case a query cache is what we want, because the query "spans" multiple CFs (or in other words, the query doesn't map directly to what's on disk but computes a result). But I'm still not sold on a query cache for direct queries of rows, for the reasons above.

          Vijay added a comment -

Alright, what I was trying to do here is to get feedback from everyone on all the use cases and try to fit them into one cache.

I did a fair amount of research to see if there is any better option and there wasn't one; the closest concept I got to was something like a block cache or the Linux page cache. When there are updates to those blocks we can find them and update them.

1) The problem shows up only when you have a wide row, which most probably means the user is doing range queries.
2) If the user has a wide row then most probably there are a large number of writes into the row, but if we invalidate the row cache on every update then it might not be useful, and the first read afterwards will have to hit multiple SSTables.
3) Let's say the user has 100 columns to query and queries them this way (especially with composite-type columns, where the column names can be larger than the values); then we can possibly run into memory pressure.
4) Having the whole row in memory is an absolutely required case and we support it (setting the min and max number of columns in a block helps here).
5) The above solution works seamlessly for narrow rows when the block size is reasonably big.

Head and tail caching is basically an optimization for reverse/forward queries, which is supported: e.g., a row with 1M columns, a block size of 500, a count of 100, and a read in reverse.

          B. Todd Burruss added a comment -

I have very wide rows (140k columns) that I randomly query, usually about 100 columns at a time. Wide rows do not work well with the SerializingCacheProvider because of the constant copying of data. ConcurrentLinkedHashMap performs very well, but eats memory because of all the ByteBuffers.

I'm trying to understand if this will help my case. Head and tail caching will help folks with time series data, but not me. Possibly the "handful of named columns" caching will help, but there will be overlap in my queries, so columns will exist in multiple cache entries, ballooning the cache.

What I was hoping for was a scheme to segment the wide row into smaller "segments" so that not as much copying is performed in the SerializingCacheProvider.

          Vijay added a comment -

Just wanted to clarify: the proposal is not a head-or-tail-of-row cache, but a block cache where blocks of a wide row are cached (with cname > x to cname < y)... here head and tail is purely an optimization for where to start a block from (from the tail or from the head).

          Jonathan Ellis added a comment -

How do you decide which blocks to cache?

          Vijay added a comment -

Hi Jonathan, when the user requests X1,Y1 we cache the block from (start -> X1-position -> Y1-position) + N, with the block size being configurable. (It should be similar to the page cache: if there is a write on a block, the whole block is marked dirty and the next fetch will go to the filesystem.) The block size, when set high enough, will cache the whole row (like the existing cache). The logic around it is roughly what the patch has.

We can also use the column indexes (if needed) and cache data within them, but the simpler approach may be to start from the requested column and cache blocks from there (otherwise updating the cache without invalidating the whole row is going to be hard).
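A rough sketch of that fill-and-dirty cycle, with blocks keyed by a simple per-row block index (hypothetical names; the attached prototype patch, not this sketch, is the authoritative version):

    import java.nio.ByteBuffer;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical block cache for one wide row: columns are cached in fixed-size
    // blocks keyed by block index; a write into a block's range just marks it dirty,
    // so the next read of that range falls through to disk and repopulates it.
    final class RowBlockCache
    {
        static final class Block
        {
            final List<ByteBuffer> columns;   // columns in comparator order
            volatile boolean dirty;

            Block(List<ByteBuffer> columns) { this.columns = columns; }
        }

        private final int blockSize;                                  // columns per block
        private final Map<Integer, Block> blocks = new ConcurrentHashMap<>();

        RowBlockCache(int blockSize) { this.blockSize = blockSize; }

        /** Block index for the Nth column of the row. */
        int blockIndexFor(int columnPosition) { return columnPosition / blockSize; }

        /** Serve a block if it is cached and clean; null means "go to disk". */
        Block getClean(int blockIndex)
        {
            Block b = blocks.get(blockIndex);
            return (b == null || b.dirty) ? null : b;
        }

        /** Populate (or repopulate) a block after a disk read. */
        void put(int blockIndex, List<ByteBuffer> columns)
        {
            blocks.put(blockIndex, new Block(columns));
        }

        /** Called from the write path: invalidate just the block the write falls into. */
        void markDirty(int columnPosition)
        {
            Block b = blocks.get(blockIndexFor(columnPosition));
            if (b != null)
                b.dirty = true;
        }
    }

Setting blockSize high enough that the whole row fits in one block degenerates into the existing whole-row cache, which is the "like the existing cache" case above.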

          Jonathan Ellis added a comment -

          How is that different from the "query cache" I waved my hands about?

          Vijay added a comment -

Quite similar, but a different version with a smaller memory footprint and efficient updates.

1) For example, if 10 columns are queried, the key will contain those names, and so will the returned CF object;
2) If we have 2 kinds of queries that overlap (a slice and column names), then we will be caching the data twice, or sometimes more, in a pure query cache;
3) If we have an update to one column out of the 10, we have to search the cached entries to see if they contain it, or invalidate the whole row; with blocks we can update the affected block and be done with it.

This also allows us to incrementally deserialize parts of the row when the whole row is cached.

          Jonathan Ellis added a comment -

          Does this support caching head/tail queries? Or do X and Y have to be existing column values?

          Jonathan Ellis added a comment -

Also, it sounds like this always invalidates on update. Would it be possible to preserve the current row cache behavior? I.e., update in place if the cache implementation is non-copying.

          Vijay added a comment -

          >>> Does this support caching head/tail queries? Or do X and Y have to be existing column values?
No, X and Y don't need to exist; they are just markers in the RowCacheKey (for example, if the query has x* -> y* we will have that in the RCK instead of xeon -> yum). It does support head and tail queries.

          >>> it sounds like this always invalidates on update. Would it be possible to preserve the current row cache behavior?
Yeah, the prototype does the update on write, but the problem is that with a lot of updates a block grows beyond its initially cached size; at some point we need to split/re-partition it...
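For what it's worth, the re-partitioning step being described might be as simple as the following sketch (hypothetical helper, not part of the prototype):

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical helper: once in-place updates have grown a cached block well past
    // the configured size, split it back into blocks of at most blockSize columns.
    final class BlockSplitter
    {
        static List<List<ByteBuffer>> split(List<ByteBuffer> overgrownBlock, int blockSize)
        {
            List<List<ByteBuffer>> result = new ArrayList<>();
            for (int i = 0; i < overgrownBlock.size(); i += blockSize)
            {
                int end = Math.min(i + blockSize, overgrownBlock.size());
                result.add(new ArrayList<>(overgrownBlock.subList(i, end)));
            }
            return result;
        }
    }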

          Jonathan Ellis added a comment -

This is basically a clunkier implementation of CASSANDRA-5357, right? Should we close it as a duplicate?

          Vijay added a comment -

Yep, closing this as a duplicate of CASSANDRA-5357.


People

  • Assignee: Vijay
  • Reporter: Stu Hood
  • Reviewer: Sylvain Lebresne
  • Votes: 4
  • Watchers: 28
