Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 2.0 beta 1
    • Component/s: None
    • Labels:
      None

      Description

      Per-CF TTL would allow compaction optimizations ("drop an entire sstable's worth of expired data") that we can't do with per-column.

      1. 3974-v9.txt
        36 kB
        Kirk True
      2. 3974-v8.txt
        36 kB
        Jonathan Ellis
      3. trunk-3974v7.txt
        40 kB
        Kirk True
      4. trunk-3974v6.txt
        40 kB
        Kirk True
      5. trunk-3974v5.txt
        40 kB
        Kirk True
      6. trunk-3974v4.txt
        51 kB
        Kirk True
      7. trunk-3974v3.txt
        42 kB
        Kirk True
      8. trunk-3974v2.txt
        38 kB
        Kirk True
      9. trunk-3974.txt
        25 kB
        Kirk True

        Activity

        Jonathan Ellis created issue -
        Dave Brosius added a comment -

        What happens if a column in a ttl'ed column family has a ttl that's longer than the cf's ttl? Would that be allowed?

        Jonathan Ellis added a comment -

        It would have to be min(cf ttl, column ttl) to be useful.

        Leonardo Stern added a comment -

        This is related to CASSANDRA-3077. It is also very helpful in map/reduce scenarios.

        Jonathan Ellis made changes -
        Field Original Value New Value
        Fix Version/s 1.2 [ 12319262 ]
        Kirk True made changes -
        Assignee Kirk True [ kirktrue ]
        Aleksey Vorona added a comment -

        Copying real-life use cases which need this feature from the older bug (CASSANDRA-3077):

        1. I want one of my CFs not to store any data older than two months. It is a "notifications" CF which is of no interest to the user past a certain point in time.
        Currently I am setting the TTL with each insert into the CF, but since it is a constant, it makes sense to me to have it configured in the CF definition so that it applies automatically to all rows in the CF.

        2. A default TTL would be very helpful in map/reduce scenarios where you don't have direct control of the TTL (e.g., Hive).

        Kirk True added a comment -

        This is a proof-of-concept patch for allowing optional default TTLs on a column family.

        I'm not sure how to implement the "compaction optimizations" the main JIRA description mentions.

        Please provide feedback on what needs to be added/changed.

        Thanks.

        Kirk True made changes -
        Attachment trunk-3974.txt [ 12521365 ]
        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Jonathan Ellis added a comment -

        Thanks Kirk!

        My comments:

        • Looks like this only updates the CQL path? We'd want to make the Thrift path cf-ttl-aware as well. I think this just means updating RowMutation + CF addColumn methods.
        • Nit: we could simplify getTTL a bit by adding assert ttl > 0.
        • I got it backwards: we want max(cf ttl, column ttl) to be able to reason about the liveness of CF data without looking at individual rows.
        • We can break the compaction optimizations into another ticket. It really needs a separate compaction strategy; the idea is that if we have an sstable A older than the CF ttl, then all the data in the file is dead and we can just delete the file without looking at it row-by-row. However, there's a lot of tension there with the goal of normal compaction, which wants to merge different versions of the same row, so we're going to churn a lot with a low chance of ever having an sstable last the full TTL without being merged, effectively restarting our timer. So, I think we're best served by an ArchivingCompactionStrategy that doesn't merge sstables at all, just drops obsolete ones, and lets people use that for append-only insert workloads. Which is a common enough case that it's worth the trouble... probably.
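
        The whole-file drop described in the last bullet can be sketched as a simple timestamp check. The sketch below is illustrative only; the class, method, and parameter names are assumptions, not actual Cassandra APIs.

```java
// Hypothetical sketch of the whole-file expiry check an archiving-style
// compaction strategy could use. Everything here (names, millisecond
// timestamps) is an illustrative assumption, not Cassandra code.
final class TtlDropCheck
{
    /**
     * @param maxTimestampMillis newest write in the sstable, ms since epoch
     * @param cfTtlSeconds       the column family's TTL in seconds
     * @param nowMillis          current time, ms since epoch
     */
    static boolean isEntirelyExpired(long maxTimestampMillis, int cfTtlSeconds, long nowMillis)
    {
        // If even the newest write has outlived the CF TTL, every cell in
        // the file is dead and the file can be deleted without reading it.
        return nowMillis >= maxTimestampMillis + cfTtlSeconds * 1000L;
    }

    public static void main(String[] args)
    {
        long now = 10_000_000L;
        // Newest write 7,200s ago with a 3,600s CF TTL: whole file droppable.
        System.out.println(isEntirelyExpired(now - 7_200_000L, 3_600, now)); // true
        // Newest write 1,800s ago: still live.
        System.out.println(isEntirelyExpired(now - 1_800_000L, 3_600, now)); // false
    }
}
```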
        Kirk True added a comment -

        Jonathan, thanks for the feedback.

        I need a bit of clarification for a newbie hacking on the code...

        Looks like this only updates the CQL path? We'd want to make the Thrift path cf-ttl-aware as well. I think this just means updating RowMutation + CF addColumn methods.

        I actually thought the opposite. Part of the code I changed was in CFMetaData's toThrift and fromThrift methods. Perhaps I'm reading too much into the method names?

        I took a look at ColumnFamily's addColumn method, but it already performs the conditional based on the TTL value.

        Nit: we could simplify getTTL a bit by adding assert ttl > 0.

        Sorry, I'm not sure which part of the code you're referring to. Can you elaborate?

        I got it backwards: we want max(cf ttl, column ttl) to be able to reason about the live-ness of CF data w/o looking at individual rows

        I cleaned up the CFMetaData.getTimeToLive method, which is now simply:

        public int getTimeToLive(int timeToLive)
        {
            return Math.max(defaultTimeToLive, timeToLive);
        }
        

        We can break the compaction optimizations into another ticket. It really needs a separate compaction Strategy; the idea is if we have an sstable A older than CF ttl, then all the data in the file is dead and we can just delete the file without looking at it row-by-row. However, there's a lot of tension there with the goal of normal compaction, which wants to merge different versions of the same row, so we're going to churn a lot with a low chance of ever having an sstable last the full TTL without being merged, effectively restarting our timer. So, I think we're best served by a ArchivingCompactionStrategy that doesn't merge sstables at all, just drops obsolete ones, and let people use that for append-only insert workloads. Which is a common enough case that it's worth the trouble... probably.

        Either way is fine. Would love to contribute.

        Jonathan Ellis added a comment - edited

        Part of the code I changed was in CFMetaData's toThrift and fromThrift methods

        Let me back up. I can see two main approaches towards respecting the per-CF ttl:

        1. Set the column TTL to the max(column, CF) ttl on insert; then the rest of the code doesn't have to know anything changed
        2. Take max(column, CF) ttl during operations like compaction, and leave column ttl (which is to say, ExpiringColumn objects) to specify only the column TTL

        The code in UpdateStatement led me to believe you're going with option 1. So what I meant by my comment was, you need to make a similar change for inserts done over Thrift RPC, as well. (to/from Thrift methods are used for telling Thrift clients about the schema, but are not used for insert/update operations.)

        Does that help?

        Sorry, I'm not sure to which part of the code you're referring

        CFMetadata.getTimeToLive. Sounds like you addressed this anyway.
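
        The two approaches can be contrasted in a small sketch. The names below are hypothetical, and the max() combiner mirrors the comment above (later in the thread the consensus shifts to min()):

```java
// Hypothetical contrast of the two approaches above; not Cassandra code.
final class TtlApproaches
{
    // Option 1: bake the CF TTL into the column at insert time. Simple,
    // but the resolved TTL is frozen: a later ALTER of the CF TTL does
    // not affect already-written columns.
    static int resolveAtInsert(int cfTtlSeconds, int columnTtlSeconds)
    {
        return Math.max(cfTtlSeconds, columnTtlSeconds);
    }

    // Option 2: store only the column TTL and combine it with the
    // current CF TTL lazily (e.g. during compaction), so CF TTL changes
    // take effect retroactively.
    static boolean isLive(long writtenAtMillis, int cfTtlSeconds, int columnTtlSeconds, long nowMillis)
    {
        long effectiveSeconds = Math.max(cfTtlSeconds, columnTtlSeconds);
        return nowMillis < writtenAtMillis + effectiveSeconds * 1000L;
    }
}
```

The trade-off option 2 buys (alterable CF TTLs) is exactly the semantics question raised later in the thread.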

        Kirk True added a comment - edited

        In the initial patch, I had made changes to both UpdateStatement.addToMutation and ColumnFamily.addColumn to use the larger of the column's TTL or the column family default TTL. I tested against the cassandra-cli and cqlsh tools and both show the default TTL being used if none is specified.

        This is all to say that it looks like both the Thrift and CQL paths are working as expected. Perhaps it's high time I found the unit tests and added some...

        Jonathan Ellis added a comment -

        In the initial patch, I had made changes to both UpdateStatement.addToMutation and ColumnFamily.addColumn to use the larger of the column's TTL or the column family default TTL

        Oops, I totally missed the addColumn changes. That's exactly what I had in mind.

        It sounds like you have an updated patch, could you post that?

        Kirk True added a comment -

        There are a few pieces missing yet:

        1. The ability to alter a column family to change the default TTL option. Because I made the change to use max(column TTL, CF TTL) at column mutate time, altering the column family default TTL value will be "lost" on such columns.
        2. I'm fighting with Python to understand why 'DESCRIBE COLUMNFAMILY FOO' doesn't show the new default TTL value. I made the change in the Python and Java layers to accept this new option, but the describe fails to display it.
        Kirk True added a comment -

        What are the semantics of updating the CF TTL? Should updating the CF TTL affect existing columns? If so, we would not want to use max(column TTL, CF TTL) at column mutation time, but instead keep them separate and dynamic, evaluating liveness at some other event (column retrieval and/or compaction).

        Dave Brosius added a comment -

        JBellis... you mean min, right?

        Jonathan Ellis added a comment -

        I mean, a CF ttl of X is useful only if it lets us reason that an sstable written more than X seconds ago is entirely expired. So... min?

        Is describe caching the schema as in CASSANDRA-4052?

        Agreed that if we want to allow altering CF ttl, keeping it separate from column ttl until we need to check for expired-ness makes the most sense.
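
        The min() reasoning above can be made concrete in a small sketch (hypothetical names; a TTL of 0 is treated as "none set"):

```java
// Hypothetical sketch: with min(), no cell can outlive the CF TTL, so an
// sstable whose newest write is older than the CF TTL is provably all
// expired. With max(), a single long column TTL would break that
// whole-file guarantee.
final class EffectiveTtl
{
    static int effectiveTtl(int cfTtlSeconds, int columnTtlSeconds)
    {
        if (columnTtlSeconds <= 0)  // no explicit column TTL
            return cfTtlSeconds;
        return Math.min(cfTtlSeconds, columnTtlSeconds);
    }
}
```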

        Kirk True added a comment - edited

        My understanding is that in order to reduce potential user confusion when updating the column family's default TTL, we need to keep the column family's default TTL value separate. That is, we probably don't want to make ExpiringColumn instances for a column family that has a default TTL (using min(CF TTL, column TTL) as the TTL value). Instead, we keep the logic as is and keep the column family's default TTL value in CFMetaData.

        That's all fine and good, but looking at the code I'm not quite sure as to when we'd check the column family default TTL. It would seem that we need to pass a CFMetaData instance in to Column's isMarkedForDelete so that it can perform logic such as:

            public boolean isMarkedForDelete(CFMetaData metadata)
            {
                if (metadata.getDefaultTimeToLive() > 0)
                {
                    // Check if we're using a CF-based TTL.
                    return System.currentTimeMillis() >= (timestamp + (metadata.getDefaultTimeToLive() * 1000L));
                }
                else
                {
                    return (int) (System.currentTimeMillis() / 1000) >= getLocalDeletionTime();
                }
            }
        

        Is this the correct line of thought? If so, that changes a couple of dozen call sites, which makes me wonder if I'm doing something wrong.

        Kirk True added a comment -

        On IRC, Jonathan suggested looking at ColumnFamilyStore.removeDeleted and PrecompactedRow.removeDeletedAndOldShards. However, it doesn't appear that either of these is called during column reads, so I can't rely on them to filter out results sent back to the client.

        The logic that I see for filtering out results sent to the client is in places such as CassandraServer.thriftifyColumns via the IColumn.isMarkedForDelete call. However, as stated previously, since an IColumn doesn't internally store a CFMetaData object, we'd have to pass one in. isMarkedForDelete is used in a lot of places, so it has a ripple effect that causes a lot of changes.

        Please advise.

        Kirk True made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Jonathan Ellis added a comment -

        CFS.removeDeleted is the one that's called during column reads. E.g., SliceByNamesReadCommand.getRow -> Table.getRow -> CFS.getColumnFamily -> CFS.removeDeleted

        Kirk True added a comment - edited

        In my case (inserting data and then calling list my_cf from the CLI), it goes through the RangeSliceCommand path, which doesn't end up calling ColumnFamilyStore.getColumnFamily. As such, the expired-and-thus-should-be-ignored columns are still showing up.

        Jonathan Ellis added a comment - edited

        RSVH still goes through removeDeleted. It can either go through the index AbstractScanIterator which calls getCF at line 195 in KeysSearcher, or the seq scan iterator via CFS.filterColumnFamily (RowIteratorFactory line 111).

        Remember that these only remove expired tombstones; non-expired ones need to be returned to the coordinator for read repair.

        Kirk True added a comment -

        In my test case, it does go through RowIteratorFactory, but it doesn't go through line 111. In getReduced, cached is always null, so it calls the filter.collateColumns path.

        So I made this naive change:

        if (cached == null)
        {
            // not cached: collate
            filter.collateColumns(returnCF, colIters, gcBefore);
            returnCF = ColumnFamilyStore.removeDeleted(returnCF, gcBefore);
        }
        else
        {
            QueryFilter keyFilter = new QueryFilter(key, filter.path, filter.filter);
            returnCF = cfs.filterColumnFamily(cached, keyFilter, gcBefore);
        }
        

        By "manually" calling removeDeleted, I was able to get my columns filtered out as expected.

        I'm pretty sure this is incomplete or just plain wrong, but I wanted to get your thoughts.

        Jonathan Ellis added a comment -

        Hmm. I think you've found a bug...

        Sylvain Lebresne added a comment -

        It does indeed seem that removeDeleted is not called on that path. I don't know if this shows up as a bug, though: columns shadowed by a row tombstone are removed by QueryFilter.isRelevant, and column tombstones are removed before being returned to the client. Still, it would probably be cleaner to put removeDeleted on every path, especially since I agree with Jonathan that it's probably the right place to put the CF-TTL check.

        Actually, I think that if we make sure to put removeDeleted on every path, we could probably make it the only method concerned with tombstones and remove QueryFilter.isRelevant, for instance, which would clean things up. But we can probably leave that for another ticket.

        Jonathan Ellis added a comment -

        If we're leaving QF cleanup for another ticket, is this done / ready for review then?

        Kirk True added a comment -

        I still need to implement unit tests. Do you have any suggestions as to an existing class into which they could be incorporated, and/or good examples to copy?

        Kirk True added a comment -

        Also, given that the logic is min(CF TTL, column TTL), when the user explicitly provides a column TTL longer than the default column family TTL, I would think we'd want to either (a) give an error or (b) provide a warning. At this point, the larger value provided by the user is simply ignored.

        Thoughts?
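
        Option (a) could be sketched as a validation at mutation time; the exception type and names below are assumptions for illustration:

```java
// Hypothetical sketch of option (a): reject, rather than silently clamp,
// an explicit column TTL larger than the CF default TTL.
final class TtlValidator
{
    static void validate(int cfDefaultTtlSeconds, int requestedTtlSeconds)
    {
        // A CF default of 0 means "no CF TTL"; nothing to enforce then.
        if (cfDefaultTtlSeconds > 0 && requestedTtlSeconds > cfDefaultTtlSeconds)
            throw new IllegalArgumentException(
                "TTL " + requestedTtlSeconds + "s exceeds the column family default of "
                + cfDefaultTtlSeconds + "s");
    }
}
```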

        Kirk True added a comment -

        Filed CASSANDRA-4299 to handle the "QF cleanup" and 'putting removeDeleted on every path'.

        Jonathan Ellis added a comment -

        when the user explicitly provides a column TTL longer than the default column family TTL, I would think we'd either want to a) give an error, or b) provide a warning

        I'd be in favor of (a).

        Leonardo Stern added a comment -

        As a user, I prefer option (a).

        Kirk True made changes -
        Attachment trunk-3974v2.txt [ 12530875 ]
        Kirk True added a comment -

        Things that still need to be done:

        • Understand why ColumnFamilyTTLTest's testColumnBasedTTLExpiration fails a basic column-based TTL expiration check
        • The check in ColumnFamily.addColumn that ensures the column TTL <= the column family TTL needs to be echoed somewhere higher up the chain to provide a better message to the user
        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 1.2 [ 12319262 ]
        Kirk True added a comment -

        CASSANDRA-4299 (removeDeleted cleanup) blocks this, as reads aren't presently calling removeDeleted and columns aren't being filtered out.

        Kirk True added a comment -

        Pinging to have someone look at patch v2.

        Kirk True made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Kirk True made changes -
        Attachment trunk-3974v3.txt [ 12533387 ]
        Kirk True added a comment -

        Includes more fixes and a unit/integration test to ensure things are working.

        Still needs CASSANDRA-4299 (removeDeleted cleanup) to filter out timed-out columns.

        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Sylvain Lebresne added a comment -

        About the removeDeleted problem: I think that trying to force calls to removeDeleted (which forces an iteration over all columns) just so we can add the logic of this ticket is the wrong approach (because it's inefficient for no good reason and doesn't make the code easier to follow). Currently, the code to ignore irrelevant columns is split between QueryFilter.isRelevant() and removeDeleted depending on which code path is taken (reads use isRelevant and compaction uses removeDeleted, basically). So I see two main options:

        1. we find a way to refactor the code so that we only ever ignore irrelevant columns in one place. That would be great but again it's unclear how to do that correctly.
        2. we put the logic for this patch in both removeDeleted and isRelevant.

        I'm personally fine going the second solution for the purpose of this ticket and keep the first option in mind for later as a way to improve the code base.
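Option 2 above can be sketched as a single shared expiration predicate that both code paths delegate to, so the rule itself is defined only once even if it is invoked from two places. This is a hypothetical illustration; the class and method names below are made up, not Cassandra's actual API.

```java
// Hypothetical sketch of option 2: one shared expiration predicate that both
// the read path (QueryFilter.isRelevant) and the compaction path
// (removeDeleted) would call. Names here are illustrative only.
final class Liveness {
    /** A column is live if it has no expiration time, or that time is still in the future. */
    static boolean isLive(int localExpirationTimeSeconds, int nowSeconds) {
        return localExpirationTimeSeconds == 0           // no TTL was ever set
            || localExpirationTimeSeconds > nowSeconds;  // TTL not yet reached
    }
}
```

With the predicate centralized, a later change to the expiration rule (such as folding in a CF-level default) touches one method instead of two code paths.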

        Robert Coli made changes -
        Attachment cassandra.1.0.10.replaying.log.after.exception.during.drain.txt [ 12537073 ]
        Robert Coli made changes -
        Attachment cassandra.1.0.10.replaying.log.after.exception.during.drain.txt [ 12537073 ]
        Robert Coli added a comment -

        Sorry for the erroneous attachment, somehow JIRA produced a link on creation of https://issues.apache.org/jira/browse/CASSANDRA-4446 which directed me here and I attached before I noticed it was the wrong ticket.

        Kirk True added a comment -

        Sylvain - the main issue is that the code isn't structured in such a way that a CFMetaData object is available.

        Neither the code for QueryFilter.isRelevant nor its callers have access to a CFMetaData. Can you think of a way to get the CFMetaData in there or a different way to structure the code in general?

        Kirk True made changes -
        Attachment trunk-3974v4.txt [ 12537719 ]
        Kirk True made changes -
        Status Patch Available [ 10002 ] In Progress [ 3 ]
        Kirk True added a comment -

        Updated version which removes the forced calls to removeDeleted and updates QueryFilter.isRelevant (and its callers). Passes all tests in integration test (ColumnFamilyTTLTest) now.

        Kirk True made changes -
        Status In Progress [ 3 ] Patch Available [ 10002 ]
        Reviewer slebresne
        Sylvain Lebresne added a comment -

        I'm sorry I'm a little late to the discussion, but I'm not sure I'm a fan of using the metadata TTL to decide expiration, because:

        1. It means we use the column timestamp to decide expiration. However, we have been very careful so far not to treat the column timestamp as a server-side timestamp. In particular, the patch assumes the timestamp is in milliseconds, while most clients and CQL actually use microseconds.
        2. Altering the default TTL is imo more confusing that way, because we are pretending that altering the TTL will apply to all existing columns, which suggests that if you want to remove everything older than, say, 1h, you can switch the TTL to 1h and then change it back right away to some other much longer value (or 0). But that's not the case, because the new TTL will only be applied to existing data when compaction happens. And I really don't think that user-visible behaviors should depend in any way on the timing of internal operations.
        3. This requires passing the CFMetadata in lots of places in the code, which isn't really nice. In particular, we should call isColumnExpiredFromDefaultTTL pretty much every time DeletionInfo.isDeleted() is called (after all, having an expired column is exactly the same as having a deleted one), and the current patch is missing quite a few places.

        So I think I do prefer the idea of having the CF TTL just being the default TTL applied to columns when inserted if they don't have one.

        Jonathan Ellis added a comment -

        If we're just going to have CF TTL being sugar for clients too lazy to apply what they want, then I'm not interested.

        But if we use CF TTL to provide an upper bound on how long data can live, then we open the door for some interesting optimizations.

        Sylvain Lebresne added a comment -

        Well, if the goal is just to be able to drop entire sstables when we know everything is expired, we could compute and keep in the metadata the min TTL of the sstable.
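A minimal sketch of that idea, assuming the sstable's metadata records the latest expiration time of any column it contains (CASSANDRA-3442 tracks similar per-sstable statistics; the names below are hypothetical, not Cassandra's actual API):

```java
// Hypothetical sketch: if an sstable's metadata records the latest local
// expiration time of any column in the file, compaction can drop the whole
// sstable once that moment has passed, without reading a single column.
// Names (SSTableExpirationStats, fullyExpired) are illustrative only.
final class SSTableExpirationStats {
    final int maxLocalExpirationTimeSeconds; // epoch seconds when the last column expires

    SSTableExpirationStats(int maxLocalExpirationTimeSeconds) {
        this.maxLocalExpirationTimeSeconds = maxLocalExpirationTimeSeconds;
    }

    /** True when every column in the sstable has already expired. */
    boolean fullyExpired(int nowSeconds) {
        return maxLocalExpirationTimeSeconds < nowSeconds;
    }
}
```

The per-CF TTL makes this check more useful in practice, because it guarantees an upper bound on expiration times and therefore that every sstable eventually becomes fully expired.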

        Jonathan Ellis added a comment -

        Hmm. Now that you mention it, Yuki already added that in CASSANDRA-3442...

        Jonathan Ellis added a comment -

        If we're just going to have CF TTL being sugar for clients too lazy to apply what they want, then I'm not interested.

        I guess it would be a good thing to have for CQL though by the same reasoning as CASSANDRA-4448.

        Sylvain Lebresne added a comment -

        I guess it would be a good thing to have for CQL though by the same reasoning as CASSANDRA-4448.

        Agreed.

        Jeremy Hanna added a comment -

        If we're just going to have CF TTL being sugar for clients too lazy to apply what they want, then I'm not interested.

        Also if that client happens to be Pig or Hive, there's not currently a way to set TTLs. So in that case it's not laziness of the client.

        A use case: I don't want to MapReduce over my giant archival column family so when ingesting data, I'll write to my archival column family and in addition a column family with a default TTL or however it's implemented, so it would just be data from the last 30 days.

        Kirk True made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Kirk True added a comment -

        I can reintroduce the 'TTL as a default' approach from the first patch if that's how we want it to work.

        Jonathan Ellis added a comment -

        I think Sylvain is right that that makes more sense... sorry about the wild goose chase!

        Jonathan Ellis made changes -
        Fix Version/s 1.2.0 [ 12323243 ]
        Fix Version/s 1.2.0 beta 1 [ 12319262 ]
        Kirk True made changes -
        Attachment trunk-3974v5.txt [ 12544246 ]
        Kirk True added a comment -

        Added v5 of the patch.

        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Sylvain Lebresne added a comment -

        Sorry, I'd forgotten about that one, so the patch needs rebasing. But from a cursory inspection, v5 looks ok (except maybe for the M/R support Jeremy suggested above, though I'm not sure where the best place to add that would be).

        Also, CASSANDRA-3442 adds info that should help us optimize things by dropping fully expired sstables, but I'm not sure the optimization itself is implemented yet. Do we want to do that in this ticket or should we move that to later (I'm good either way)?

        Kirk True made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Kirk True added a comment -

        Version 6, sync'ed with trunk as of this morning.

        Kirk True made changes -
        Attachment trunk-3974v6.txt [ 12548657 ]
        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Sylvain Lebresne made changes -
        Fix Version/s 1.2.0 rc1 [ 12323481 ]
        Fix Version/s 1.2.0 [ 12323243 ]
        Kirk True added a comment -

        Pinging for feedback.

        Sylvain Lebresne added a comment -

        I realize I'm reviewer on this one. I seem to remember that Jonathan Ellis wanted to have a look at this, but maybe I misunderstood?

        In any case, I had a look at the patch and it looks good to me overall. That said, and it's not really a criticism of the patch, I was slightly surprised that we only need to modify ColumnFamily.addColumn(QueryPath, ...) and InsertStatement to make this work. Don't get me wrong, I do think this is correct, but it does feel a bit fragile that some future internal code could too easily add an ExpiringColumn through ColumnFamily.addColumn(IColumn) and skip the global cf setting. I don't really have any good solution to make it less fragile, however; I'm just thinking out loud. But that remark aside, again the patch does lgtm (aside from needing a rebase).

        Jonathan Ellis added a comment -

        it does feel a bit fragile that some future internal code could too easily add an ExpiringColumn through ColumnFamily.addColumn(IColumn)

        Tracing and unit tests (via Util.expiringColumn) already use this method.

        Should we make the constructors private and expose a factory method that requires the metadata?

        Sylvain Lebresne added a comment -

        Should we make the constructors private and expose a factory method that requires the metadata?

        I like that idea.

        Kirk True added a comment -

        Rebase against trunk.

        Privatizing the constructors causes a lot of collateral changes and forces the creation of a factory method that is IMO not very intuitive to the caller.

        Kirk True made changes -
        Attachment trunk-3974v7.txt [ 12553240 ]
        Kirk True made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Jonathan Ellis added a comment -

        If we use ttl <= 0 as a signal to use the default ttl in CF.addColumn, how do we override the default to be "no ttl at all?" Should we treat Integer.MAX_VALUE as "don't use the default, just give me a non-expiring Column?"

        Does UpdateStatement go through addColumn eventually? If so we are duplicating code there. If not that makes me a bigger fan of centralizing this in a factory method. (Guess we can leave the other constructors alone.)
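One way a factory method could resolve the effective TTL under the rules being discussed. This is a sketch with made-up names (CfMetadata, Ttl.effective), not the committed implementation, and it includes the Integer.MAX_VALUE opt-out that the thread ultimately set aside as YAGNI:

```java
// Hypothetical sketch of the TTL-resolution rules under discussion:
// ttl <= 0 falls back to the CF default, and Integer.MAX_VALUE explicitly
// opts out of the default. All names here are illustrative only.
final class CfMetadata {
    final int defaultTimeToLive; // seconds; 0 means "no default TTL"
    CfMetadata(int defaultTimeToLive) { this.defaultTimeToLive = defaultTimeToLive; }
}

final class Ttl {
    static final int NO_TTL = Integer.MAX_VALUE; // "just give me a non-expiring Column"

    /** Resolve the TTL to apply to a new column. A result of 0 means "never expires". */
    static int effective(int requestedTtl, CfMetadata metadata) {
        if (requestedTtl == NO_TTL)
            return 0;                           // caller explicitly opted out of the default
        if (requestedTtl <= 0)
            return metadata.defaultTimeToLive;  // fall back to the per-CF default
        return requestedTtl;                    // honour the caller's TTL
    }
}
```

Centralizing this in one factory means both CF.addColumn and UpdateStatement resolve the TTL identically instead of duplicating the rule.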

        Kirk True added a comment -

        Jonathan, we want the ability for clients to explicitly not use the column family default TTL?

        Jonathan Ellis added a comment -

        I guess I thought that was implied but we can apply YAGNI here.

        What about UpdateStatement/addColumn?

        Jonathan Ellis added a comment -

        v8 attached to add Column.create factory and use that from addColumn and UpdateStatement

        Jonathan Ellis made changes -
        Attachment 3974-v8.txt [ 12553895 ]
        Sylvain Lebresne added a comment -

        v8 lgtm.

        Jonathan Ellis added a comment -

        Committed to 1.2 but lots of conflicts merging to trunk so backed it out.

        Jonathan Ellis made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Fix Version/s 1.3 [ 12322954 ]
        Fix Version/s 1.2.0 rc1 [ 12323481 ]
        Jeremy Hanna added a comment - edited

        So this isn't going into 1.2 because it didn't apply cleanly to trunk? I'm confused why the status is set to open and the target version is now set to 1.3.

        Jonathan Ellis added a comment -

        Set to open: because patch has been reviewed, no need for it to keep showing up in Patch Available until there is a new one.

        Target version 1.3: because we missed the window to make 1.2.0rc1.

        Kirk True added a comment -

        Rebased against trunk.

        Kirk True made changes -
        Attachment 3974-v9.txt [ 12554756 ]
        Kirk True made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Jonathan Ellis made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Jonathan Ellis added a comment -

        committed, thanks!

        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12655154 ] patch-available, re-open possible [ 12753068 ]
        Gavin made changes -
        Workflow patch-available, re-open possible [ 12753068 ] reopen-resolved, no closed status, patch-avail, testing [ 12755733 ]

          People

          • Assignee: Kirk True
          • Reporter: Jonathan Ellis
          • Reviewer: Sylvain Lebresne
          • Votes: 8
          • Watchers: 16