Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.1.0
    • Component/s: None
    • Labels:
      None

      Description

      While performing stress tests to find race problems for CASSANDRA-2864, I believe I (re-)found one in the standard on-heap row cache.

      During my stress test I have lots of threads running, some of them only reading while others write and then re-read the value.

      This seems to happen:

      • The reader tries to read row A for the first time, doing a getTopLevelColumns
      • Row A, which is not in the cache yet, is updated by a writer. The row is not read eagerly during the write (because we want fast writes), so the writer cannot update the cache
      • The reader puts the row in the cache, which is now missing the update
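      The interleaving above is a classic check-then-act race: the value is assembled outside the cache, so a write that lands between the read and the cache insert is lost. A minimal sketch (illustrative Java only, not Cassandra code; all names are invented):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RowCacheRace {
    static final Map<String, String> store = new ConcurrentHashMap<>(); // memtables+sstables stand-in
    static final Map<String, String> cache = new ConcurrentHashMap<>(); // row cache stand-in

    // Writer: updates the store but (deliberately) not the cache,
    // because the row is not read during the write.
    static void write(String key, String value) {
        store.put(key, value);
    }

    public static void main(String[] args) {
        store.put("A", "v1");
        // Reader step (1): read row A from the store (getTopLevelColumns stand-in)
        String read = store.get("A");
        // Writer step (2): a concurrent update arrives before the reader caches
        write("A", "v2");
        // Reader step (3): the stale value is cached; the update is now invisible
        cache.put("A", read);
        System.out.println(cache.get("A")); // prints v1, although the store holds v2
    }
}
```

Restarting the server clears the cache, which is why the expected column reappears after a restart.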

      I already asked about this some time ago on the mailing list, but unfortunately didn't dig further after I got no answer, since I assumed that I had just missed something. In a way I still do, but I haven't found any locking mechanism that ensures this cannot happen.

      The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there; it's just missing from the cache.

      To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.

      I can also reproduce this in 0.8. I haven't checked 1.1, but I haven't found any relevant change there either, so I assume the same applies.

      1. 3862_v3.patch
        13 kB
        Sylvain Lebresne
      2. 3862_v8_addon.txt
        5 kB
        Sylvain Lebresne
      3. 3862.patch
        12 kB
        Sylvain Lebresne
      4. 3862-7.txt
        25 kB
        Jonathan Ellis
      5. 3862-cleanup.txt
        9 kB
        Jonathan Ellis
      6. 3862-v2.patch
        17 kB
        Sylvain Lebresne
      7. 3862-v4.patch
        18 kB
        Sylvain Lebresne
      8. 3862-v5.txt
        24 kB
        Jonathan Ellis
      9. 3862-v6.txt
        26 kB
        Jonathan Ellis
      10. 3862-v8.txt
        31 kB
        Jonathan Ellis
      11. include_memtables_in_rowcache_read.patch
        7 kB
        Daniel Doubleday

        Activity

        doubleday Daniel Doubleday added a comment -

        Dunno if there's a better way to do it...

        slebresne Sylvain Lebresne added a comment -

        I believe you are absolutely right that this is a bug.

        Unfortunately I don't think including the memtables during cache reads really solves it. If you miss an update, it won't ever get added to the cached row, but the update itself will be flushed at some point and thus not be in any memtable anymore.

        One partial solution I see could be that when a read 'reads for caching', it starts by adding a sentinel object in the cache for the given row key. That sentinel would need to be an actual (empty) row, but marked as being only a sentinel. When a write looks up the row in the cache and finds a sentinel, it would add the write to the sentinel. Once the read returns and we actually put the row in the cache, we would merge it (atomically) with the content of the sentinel. A read that checks the cache and sees a sentinel would just skip the cache (and would not put its result into the cache). Adapting that to the serializingCache is trivial.

        Unfortunately, this is not perfect because it would break counters. Though I guess for counters we could do the same thing as we would do for the serializingCache, i.e., if a read that 'reads for caching' sees that the sentinel is not empty, we would just not cache the result (i.e., a row would be cached only if we are sure no write was done concurrently with the read).
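        The sentinel idea can be sketched roughly like this (illustrative Java only; the class and method names are invented and this is not the eventual patch):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class SentinelSketch {
    // The sentinel stands in for an (empty) row that collects writes arriving
    // while the read-for-cache is still reading from disk.
    static class Sentinel {
        final List<String> pendingWrites = new CopyOnWriteArrayList<>();
    }

    static final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    // Read path, step 1: install a sentinel before the (long) disk read.
    // Returns null if another loader beat us; that read just skips the cache.
    static Sentinel beginReadForCache(String key) {
        Sentinel s = new Sentinel();
        return cache.putIfAbsent(key, s) == null ? s : null;
    }

    // Write path: if a sentinel is present, append the update to it so the
    // in-flight read cannot cache a row that misses this write.
    static void write(String key, String update) {
        Object v = cache.get(key);
        if (v instanceof Sentinel)
            ((Sentinel) v).pendingWrites.add(update);
    }

    // Read path, step 2: merge the disk result with whatever the sentinel
    // collected, then atomically swap the sentinel for the merged row.
    static String completeReadForCache(String key, Sentinel s, String diskValue) {
        String merged = diskValue + s.pendingWrites; // stand-in for a real row merge
        cache.replace(key, s, merged);
        return merged;
    }
}
```

Because the sentinel relies on identity here (each instance is unique), the replace only succeeds for the reader that installed it.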

        doubleday Daniel Doubleday added a comment -

        If you miss an update, it won't ever get added to the cached row, but the update itself will be flushed at some point and thus not be in any memtable anymore.

        Very true ...

        How about adopting the strategy we apply with CASSANDRA-2864:

        • Writers don't update the cache at all
        • Readers merge cache with memtables
        • Upon flush merge memtables with cache
        slebresne Sylvain Lebresne added a comment -

        How about adopting the strategy we apply with CASSANDRA-2864:

        • Writers don't update the cache at all
        • Readers merge cache with memtables
        • Upon flush merge memtables with cache

        The problem with that is that I don't see how we can make that work for counters at all. I also think it would be nice not having to merge on reads if we can avoid it (even if it's in-memory, it still uses CPU).

        As a side note, I also suspect it's not bulletproof in theory, as a memtable could be fully flushed while a 'read to be cached' happens, and with bad timing during that, we could still miss an update. Of course, that kind of timing has almost no chance of happening. But in the case where a user triggers a flush manually, a memtable with only a handful of columns could be flushed very quickly, and I suspect the behavior could be observed. However unlikely that is, it'd be better if we can fix this problem once and for all.

        I'll probably give a shot to my 'sentinel' proposal described above, I don't think it's too much code.

        doubleday Daniel Doubleday added a comment -

        Hm, okay ... I don't want to abuse Jira as an educational forum, but maybe as a reward for the bug report ... are you saying that a reader could see a memtable view where flushing memtables are gone (flushed) and the sstables don't yet contain the flushed memtables?

        If that's the case then yes, the cache would lose an update. But wouldn't that also imply that a read could miss an update without any caching in place at all?

        Otherwise (and that's how I read the code), given that the memtable switch only happens after the merge, the reader will see all updates, because they are either in (flushing) memtables or in sstables, and the cache will in fact be valid.

        slebresne Sylvain Lebresne added a comment -

        To be precise, what I'm saying is that (at least in theory) the following scenario would be possible:

        • A read-for-cache reads the memtables, grabbing updates
        • then it starts reading the sstables
        • while the previous happens, a new update arrives. The memtable is then flushed, and happens to be fully flushed before our read-for-cache completes.

        In that case, the new update won't ever be part of the cached row, because during the flush (when we would merge the memtable into the cache) the row was not in the cache yet. That may seem far-fetched, but consider a simple implementation of your proposal, where the 'upon flush merge memtables with cache' phase happens in the same loop over rows that is used for flushing. It is actually possible for a new write to be "flushed" within a few milliseconds of being received by the node: if the update triggers the memtable threshold and sorts at the very beginning of the memtable. But don't get me wrong, it would probably be possible to deal with that problem; it just feels a bit complicated and error prone.

        slebresne Sylvain Lebresne added a comment -

        Patch attached with my "sentinel" idea (the patch is against 1.1 currently). I think this fixes the problem, and it deals with counters.

        doubleday Daniel Doubleday added a comment -

        Just had a look at it and maybe I got it wrong but:

        CFS.getRawCachedRow returns null for a sentinel and CFS.updateRowCache calls this.
        Isn't updateRowCache supposed to add changes to the sentinel so that cacheRow can detect the race?

        slebresne Sylvain Lebresne added a comment -

        You're right, thanks for catching that. Attached v2 fixes this. (I realized that when we hit the cache during range_slice queries we don't update the statistics, which I'm not sure is what we want, but it's unrelated to this issue so I haven't changed it.)

        jbellis Jonathan Ellis added a comment -

        Looks to me like we might be able to simplify things by splitting the "initialize row cache" code (which can assume the cache is empty, and does not need a filter) out from the "look up a cached row and cache it if it is not present" method.

        Nit: although not perfect, IMO "getRawCachedRow" is a better method name than "getCachedRowNoStats" – the important thing to convey is that we're only inspecting the cache's contents, not changing them.

        slebresne Sylvain Lebresne added a comment -

        Attaching v3. This mostly fixes a bug in the previous version where sentinels were not handled correctly in cacheRow(). I've also switched back to getRawCachedRow.
        I'm not entirely sure what exactly you proposed to split, but v3 does split cacheRow() in the hope of increasing clarity.

        jbellis Jonathan Ellis added a comment -

        Attached cleanup patch that applies on top of v3. Most of the changes are adding docstrings/comments and cleaning up typos.

        A minor change to the code was to make cacheRow take just cfId and filter, removing the redundant filter.key as a parameter.

        I also renamed cacheRow to getThroughCache. Still not 100% happy with that, but my goal is to make the distinction from readAndCache more obvious.

        Finally, I've modified the logic in invalidateCachedRow according to the reasoning in this comment:

                // This method is used to (1) drop obsolete entries from a copying cache after the row in question was updated
                // and to (2) make sure we're not wasting cache space on rows that don't exist anymore post-compaction.
                // Sentinels complicate this because it means we've caught a read thread in the process of loading
                // the cache, and we don't know (in case 2) if it will do so with rows from before the compaction or after,
                // so we need to loop until the load completes.
        

        (I also negated the loop condition, which looked like an oversight.)

        slebresne Sylvain Lebresne added a comment -

        The cleanup lgtm.

        For the change to invalidateCachedRow, however, I wonder if it's worth it. By waiting when we find a sentinel, we may have writes waiting on a read to complete, which could involve a non-negligible latency spike. On the other side, if we just leave the sentinel in that case, the only risk we take is that a read may end up putting tombstones in the cache that are already expired. That doesn't seem like a big deal, especially given that it will very rarely happen.

        But in any case, you're right about negating the loop condition.

        jbellis Jonathan Ellis added a comment -

        By waiting when we found a sentinel, we may have writes waiting on a read to complete, which could involve a non negligible latency spike

        You're right, that's a worse negative than leaving tombstones in the cache. I'm fine with changing it back if you update the comments accordingly.

        slebresne Sylvain Lebresne added a comment -

        Actually, the handling of the copying cache by the preceding patches is wrong. When a put arrives and there is a sentinel, the patch does not add the put to the sentinel correctly. But thinking about it, for the copying cache we should avoid having writes check the current value in the cache, because that has a non-negligible performance impact. What we should do is let invalidate actually invalidate sentinels. The only problem we're faced with if we do that is that when a read-for-caching returns, it must make sure its own sentinel hasn't been invalidated. In particular it must be careful of the case where the sentinel has been invalidated and another read has set another sentinel.

        Anyway, attaching a v4 (that includes the comment cleanups) that chooses that strategy instead (and thus is (hopefully) not buggy even in the copying cache case). Note that it means reads must be able to identify sentinels uniquely (not based on their content), so the code assigns a unique ID to each sentinel and uses that for comparison.

        jbellis Jonathan Ellis added a comment -

        Attached v5 with a simpler approach: for serializing cache, getThroughCache does a classic CAS loop with a sentinel vs the write's invalidate.

        v5 also adds a containsCachedRow method to CFS so that callers that don't care about the value don't force a deserialize in the serializing cache case.
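        The CAS loop described here can be sketched schematically like this (assumed names; a plain ConcurrentMap stands in for the row cache, and sentinel identity is per-instance):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Schematic CAS loop: install a sentinel, read from disk, then try to replace
// the sentinel with the real data. A write's invalidate removes the sentinel,
// which makes the replace fail and forces the reader to retry, so the write is
// never silently lost.
public class CasLoopSketch {
    static class Sentinel {} // identity-based equals: each sentinel is unique

    static final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    static String readFromDisk(String key) { return "row-" + key; }

    static String getThroughCache(String key) {
        while (true) {
            Object cached = cache.get(key);
            if (cached != null && !(cached instanceof Sentinel))
                return (String) cached;          // cache hit
            Sentinel sentinel = new Sentinel();
            if (cache.putIfAbsent(key, sentinel) != null)
                continue;                        // another loader is active; retry
            String data = readFromDisk(key);
            if (cache.replace(key, sentinel, data))
                return data;                     // no write raced us; data is cached
            // replace failed: a write invalidated our sentinel; loop and re-read
        }
    }

    // Write path: invalidate drops whatever is cached, sentinel or real row.
    static void invalidate(String key) { cache.remove(key); }
}
```

Note that this only works if a reader's replace cannot succeed against another reader's sentinel, which is exactly the point debated in the next comments.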

        slebresne Sylvain Lebresne added a comment -

        I don't think v5 works. All sentinels are empty CF, so all sentinels will be equal (in SerializingCache.contentsEqual()). Which means we can have the following sequence of actions:

        • a read r1 comes, the cache is empty; it sets sentinel s1 and starts reading from disk
        • a write w comes and invalidates s1.
        • a read r2 comes, the cache is (now) empty; it sets sentinel s2 and starts reading from disk
        • r1 finishes reading from disk, having missed w. It'll do the replace, but since all sentinels are equal this will succeed (even though the current sentinel is the one of the second read), and we'll end up having missed w.

        That's the reason for the sentinel IDs of v4.
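        The failure hinges on sentinel equality being content-based. A toy demonstration (plain Java; the content-equal class stands in for what SerializingCache.contentsEqual would see with empty rows, and all names are invented):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// With content-based equality, every empty sentinel equals every other, so a
// stale reader can replace a sentinel it does not own and a write is lost.
public class SentinelEquality {
    // Stand-in for comparing serialized empty rows byte-for-byte.
    static final class ContentSentinel {
        @Override public boolean equals(Object o) { return o instanceof ContentSentinel; }
        @Override public int hashCode() { return 1; }
    }

    static boolean demo() {
        ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
        ContentSentinel s1 = new ContentSentinel(); // set by read r1
        cache.put("A", s1);
        cache.remove("A");                          // write w invalidates s1
        ContentSentinel s2 = new ContentSentinel(); // set by read r2
        cache.put("A", s2);
        // r1 returns, having missed w, and replaces "its" sentinel:
        return cache.replace("A", s1, "stale row"); // true: s1.equals(s2), so w is lost
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With identity-based sentinels (each instance unique), the same replace would fail and r1 would know a write had raced it.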

        jbellis Jonathan Ellis added a comment -

        v6 pulls RCS out to a separate file and adds a uuid version and equals/hashcode methods. SerializingCacheProvider uses a custom CF serializer that is RCS-aware. SerializingCache.replace is simplified to use RCS.equals. CAS loop is extended to non-serializing cache: since cache/write race is extremely rare, I'd rather take the occasional re-read penalty, than increase the overhead of every row in the cache by making them RCS objects permanently.

        slebresne Sylvain Lebresne added a comment -

        Remarks on v6:

        • Since we don't add stuff to the sentinel, it has no reason to be a subclass of ColumnFamily. We should probably create a CachedRow class extended by both Sentinel (which would really just be an identifier, no metadata needed) and ColumnFamily, and use that as the cache value type. It'll be cleaner and, more importantly, more type safe (a cache lookup won't be able to ignore by mistake that it could get a sentinel).
        • Not adding stuff to the sentinel also means that in getThroughCache the counter special case is not needed anymore.
        • In getThroughCache, if we fail to replace the sentinel, I think we'd still be better off returning the data rather than looping and re-reading. Better to let the next client read cache the data than to inflict terrible latency on the current read.
        • Is it really an improvement to use UUIDs (over an AtomicLong)? I have nothing against UUIDs per se, but they take twice the space (and we serialize them) and, without having benchmarked it, I'm willing to bet longs are much faster to generate. And let's be honest, the risk of overflowing an AtomicLong is science fiction (to be precise, at 1 million sentinels created per second (which is way more than we'll ever see), you'd need more than 100,000 years of uptime to overflow).
        jbellis Jonathan Ellis added a comment - edited

        v7 attached.

        Since we don't add stuff to the sentinel, it has no reason to be a subclass of ColumnFamily

        True, but when I tried this I ended up with a LOT of casting cache values to CF. I think it might be the lesser of evils the way it is.

        the counter special case is not needed anymore

        Updated.

        if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading

        Makes sense, updated.

        Is it really an improvement to use UUIDs (over an AtomicLong)?

        I'd rather have the reduced contention on instantiation than the 8 bytes of space (during the sentinel lifetime – this goes away once the sentinel is replaced by the data CF).

        slebresne Sylvain Lebresne added a comment -
        • In SerializingCache, remove misses a "not" in the while condition (this dates back to one of my earlier patches). We don't need that new remove method anymore, though, so it's probably simplest to just drop it from the patch.
        • In CFS.getThroughCache, the following line
          boolean sentinelSuccess = !CacheService.instance.rowCache.putIfAbsent(key, sentinel);
          

          should not be negated.

        • Also in CFS.getThroughCache, we won't remove the sentinel if there is an exception during the read. It's not a big deal but it doesn't cost much to prevent it from happening.
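        The exception-safety remark can be sketched as a try/finally around the read (schematic only; sentinelSuccess follows the discussion, the rest of the names are invented, and a plain ConcurrentMap stands in for the row cache):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// If the disk read throws, the finally block removes the sentinel we installed
// so it cannot get stuck in the cache forever.
public class ExceptionSafeLoad {
    static final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
    static final Object SENTINEL = new Object();

    static String getThroughCache(String key, boolean failRead) {
        // true only if WE installed the sentinel (note: not negated)
        boolean sentinelSuccess = cache.putIfAbsent(key, SENTINEL) == null;
        String data = null;
        try {
            if (failRead)
                throw new RuntimeException("disk read failed"); // simulated read error
            data = "row-" + key;
            if (sentinelSuccess)
                cache.replace(key, SENTINEL, data); // swap sentinel for the real row
            return data;
        }
        finally {
            // Exception path: data is still null, so clean up our own sentinel.
            // Guarded by sentinelSuccess so we never remove someone else's.
            if (sentinelSuccess && data == null)
                cache.remove(key, SENTINEL);
        }
    }
}
```

The sentinelSuccess guard in the finally block is the same point raised later in the thread about the finally-block condition.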

        True, but when I tried this I ended up with a LOT of casting cache values to CF. I think it might be the lesser of evils the way it is.

        In that case, I think I wouldn't mind the casts too much, and I would prefer getting the type safety of knowing that a method that takes a ColumnFamily can't ever get a sentinel (and of making it explicit when you need to care about sentinels or not), but that's a bit subjective. There would also be some small wins, like the fact that we wouldn't need to save the cfId when serializing a sentinel.

        I'd rather have the reduced contention on instantiation than the 8 bytes of space

        My point was that the UUIDs don't reduce contention. UUID.randomUUID() uses SecureRandom.nextBytes(), which is synchronized (and thus likely entails a much bigger degradation in the face of contention than an AtomicLong) and is probably a bit CPU intensive. For reference, I did a quick micro-benchmark with 50 threads generating 10,000 ids simultaneously using both methods: the AtomicLong is two orders of magnitude faster.
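        A minimal single-threaded version of such a comparison might look like this (illustrative only; real contention numbers require a multi-threaded harness like the one described, and results vary by JVM):

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Rough comparison of the two sentinel-ID schemes: UUID.randomUUID()
// (SecureRandom-backed) versus AtomicLong.incrementAndGet().
public class IdBench {
    static final AtomicLong counter = new AtomicLong();

    // Run r n times and return elapsed nanoseconds.
    static long time(Runnable r, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++)
            r.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100_000;
        long uuidNs = time(() -> UUID.randomUUID(), n);
        long longNs = time(() -> counter.incrementAndGet(), n);
        System.out.println("UUID:       " + uuidNs / n + " ns/op");
        System.out.println("AtomicLong: " + longNs / n + " ns/op");
    }
}
```

Under contention the gap widens further, since SecureRandom serializes callers while the AtomicLong uses a lock-free CAS.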

        jbellis Jonathan Ellis added a comment -

        v8 attached w/ long sentinel and IRowCacheEntry.

        Hide
        slebresne Sylvain Lebresne added a comment -

        v8 lgtm mostly except for the 3 remarks at the beginning of my previous comment. Added simple patch on top of v8 that does the proposed modifications.

        Hide
        jbellis Jonathan Ellis added a comment -

        shouldn't if (data == null) in the finally block be if (sentinelSuccess && data == null) ?
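The cleanup condition under review can be sketched with a simplified sentinel read path. This is not Cassandra's actual CFS.getThroughCache; the `SentinelCache` class and `loader` parameter are hypothetical, and a single shared sentinel object stands in for the per-read sentinel ids discussed above (which are what let a concurrent writer invalidate an in-flight read):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class SentinelCache<K, V> {
    // Placeholder marking an in-flight read (simplification: one shared instance).
    private static final Object SENTINEL = new Object();
    private final ConcurrentMap<K, Object> cache = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    public V get(K key, Supplier<V> loader) {
        Object cached = cache.get(key);
        if (cached != null && cached != SENTINEL) {
            return (V) cached; // cache hit
        }
        // Only the thread that installs the sentinel may later publish the value.
        boolean sentinelSuccess = cache.putIfAbsent(key, SENTINEL) == null;
        V data = null;
        try {
            data = loader.get(); // the actual read (e.g. getTopLevelColumns)
            if (sentinelSuccess && data != null) {
                // Replace only OUR sentinel; if a writer invalidated it,
                // this is a no-op and the stale value is never cached.
                cache.replace(key, SENTINEL, data);
            }
            return data;
        } finally {
            // The remark above: without the sentinelSuccess check, a thread
            // that lost the putIfAbsent race would remove another reader's
            // sentinel on a failed/empty read.
            if (sentinelSuccess && data == null) {
                cache.remove(key, SENTINEL);
            }
        }
    }
}
```

The `finally` block is what the question targets: cleanup must be conditional on having installed the sentinel in the first place, otherwise an exception in one reader can tear down another reader's in-flight marker.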

        Hide
        slebresne Sylvain Lebresne added a comment -

        Oops, you're right. Patch updated.

        Hide
        jbellis Jonathan Ellis added a comment -

        +1

        Hide
        slebresne Sylvain Lebresne added a comment -

        Committed, thanks

        Hide
        hudson Hudson added a comment -

        Integrated in Cassandra #1646 (See https://builds.apache.org/job/Cassandra/1646/)
        restore pre-CASSANDRA-3862 approach to removing expired tombstones during compaction (Revision fbb5ec0374e1a5f1b24680f1604b6e9201fb535f)
        fix build - re-add CompactionController.removeDeletedInCache for commit fbb5ec0374e1a5f1b24680f1604b6e9201fb535f restore pre-CASSANDRA-3862 approach to removing expired tombstones during compaction (Revision 086c06ad7fb211de6be877c3c1ea2ee4f86c6d7e)

        Result = ABORTED
        jbellis :
        Files :

        • src/java/org/apache/cassandra/db/compaction/CompactionIterable.java
        • CHANGES.txt

        dbrosius :
        Files :

        • src/java/org/apache/cassandra/db/compaction/CompactionController.java

          People

          • Assignee: Sylvain Lebresne (slebresne)
          • Reporter: Daniel Doubleday (doubleday)
          • Reviewer: Jonathan Ellis
          • Votes: 0
          • Watchers: 4
