Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Fix Version/s: 2.0.1
    • Component/s: None
    • Labels:

      Description

      SSTableBoundedScanner seeks rather than scanning through rows, so it would be significantly more efficient than the existing per-key filtering that cleanup does.

        Activity

        Hide
        Jonathan Ellis added a comment -

        committed

        Show
        Jonathan Ellis added a comment - committed
        Hide
        Aleksey Yeschenko added a comment -
        Show
        Aleksey Yeschenko added a comment - Jonathan Ellis +1
        Hide
        Jonathan Ellis added a comment -

        patch to respect per-sstable partitioner

        Show
        Jonathan Ellis added a comment - patch to respect per-sstable partitioner
        Hide
        Aleksey Yeschenko added a comment -

        This patch broke at the very least sstable2json for 2i sstables (w/ LocalPartitioner).

        Show
        Aleksey Yeschenko added a comment - This patch broke at the very least sstable2json for 2i sstables (w/ LocalPartitioner).
        Hide
        Marcus Eriksson added a comment -

        committed as 9e846d9ff69f825f6200f7f75fdfc53926bfc255

        Show
        Marcus Eriksson added a comment - committed as 9e846d9ff69f825f6200f7f75fdfc53926bfc255
        Hide
        Marcus Eriksson added a comment -

        looks good to me! i'll get it committed tonight

        Show
        Marcus Eriksson added a comment - looks good to me! i'll get it committed tonight
        Hide
        Tyler Hobbs added a comment -

        With a range-aware scanner, do we still need 5722?

        When there are counters and secondary indexes, a full scan would still need to be performed, so 5722 would let us avoid that case when it's unnecessary.

        Show
        Tyler Hobbs added a comment - With a range-aware scanner, do we still need 5722? When there are counters and secondary indexes, a full scan would still need to be performed, so 5722 would let us avoid that case when it's unnecessary.
        Hide
        Jonathan Ellis added a comment -

        With a range-aware scanner, do we still need 5722?

        Show
        Jonathan Ellis added a comment - With a range-aware scanner, do we still need 5722?
        Hide
        Tyler Hobbs added a comment -

        Patch 2524-v1.txt (and branch) builds on Marcus's work and adds multiple range support to SSTableScanner. There didn't end up being much overlap with the work on CASSANDRA-5722.

        Show
        Tyler Hobbs added a comment - Patch 2524-v1.txt (and branch ) builds on Marcus's work and adds multiple range support to SSTableScanner. There didn't end up being much overlap with the work on CASSANDRA-5722 .
        Hide
        Jonathan Ellis added a comment -

        Assigning to Tyler since his work on CASSANDRA-5722 gets us most of the way there.

        Show
        Jonathan Ellis added a comment - Assigning to Tyler since his work on CASSANDRA-5722 gets us most of the way there.
        Hide
        Jonathan Ellis added a comment -

        I think the analog would be to create a SSTS with a Range.

        Show
        Jonathan Ellis added a comment - I think the analog would be to create a SSTS with a Range.
        Hide
        Marcus Eriksson added a comment - - edited

        Jonathan Ellis this should definetly be deprecated after CASSANDRA-4180 though, right? (SSTBS being gone and all)

        Show
        Marcus Eriksson added a comment - - edited Jonathan Ellis this should definetly be deprecated after CASSANDRA-4180 though, right? (SSTBS being gone and all)
        Hide
        Marcus Eriksson added a comment -

        hmm in that case the explicit cleanup should perhaps be removed? Have to say I do like having the option to reclaim space directly, not having to wait for compaction (and when i do cleanup, it would be nice if it was quick )

        Show
        Marcus Eriksson added a comment - hmm in that case the explicit cleanup should perhaps be removed? Have to say I do like having the option to reclaim space directly, not having to wait for compaction (and when i do cleanup, it would be nice if it was quick )
        Hide
        Vijay added a comment -

        Marcus, I think jonathan is talking about the use of this optimization, since users might not need to run cleanup after 5051.

        Show
        Vijay added a comment - Marcus, I think jonathan is talking about the use of this optimization, since users might not need to run cleanup after 5051.
        Hide
        Marcus Eriksson added a comment -

        Vijay what do you think?

        Show
        Marcus Eriksson added a comment - Vijay what do you think?
        Hide
        Marcus Eriksson added a comment -

        hmm i dont see it replacing the scanner in doCleanupCompaction, so this would still be valid right?

        Show
        Marcus Eriksson added a comment - hmm i dont see it replacing the scanner in doCleanupCompaction, so this would still be valid right?
        Hide
        Jonathan Ellis added a comment -

        Hmm, I definitely meant to comment on this earlier... I think this gets obsoleted by CASSANDRA-5051?

        Show
        Jonathan Ellis added a comment - Hmm, I definitely meant to comment on this earlier... I think this gets obsoleted by CASSANDRA-5051 ?
        Hide
        Marcus Eriksson added a comment -

        oops, db.RowMutation should not be in the patch

        Show
        Marcus Eriksson added a comment - oops, db.RowMutation should not be in the patch
        Hide
        Marcus Eriksson added a comment -

        mostly an update to the last patch by Stu Hood and adding a background task to clean up the row cache

        Show
        Marcus Eriksson added a comment - mostly an update to the last patch by Stu Hood and adding a background task to clean up the row cache
        Hide
        Jonathan Ellis added a comment -

        (SSTBS has been replaced by a SSTS constructor taking a range)

        Show
        Jonathan Ellis added a comment - (SSTBS has been replaced by a SSTS constructor taking a range)
        Hide
        Sylvain Lebresne added a comment -

        This looks good to me, though one thing that goes away with that is that for cf that use the BoundedScanner, we don't cleanup the row cache. This is not a really big deal, but it's perfectly possible that following the cleanup, rows that are still on the node get evicted from cache before rows that are not in the ranges of the node, so it's a slight regression. Maybe we could simply add a cache cleanup that iterate through the cache and only keeps keys in range? Though one may argue it's on the fringe of this ticket, I'd prefer we add this here so we commit something that is an improvement, rather than an improvement with a slight regression to be fixed later.

        Show
        Sylvain Lebresne added a comment - This looks good to me, though one thing that goes away with that is that for cf that use the BoundedScanner, we don't cleanup the row cache. This is not a really big deal, but it's perfectly possible that following the cleanup, rows that are still on the node get evicted from cache before rows that are not in the ranges of the node, so it's a slight regression. Maybe we could simply add a cache cleanup that iterate through the cache and only keeps keys in range? Though one may argue it's on the fringe of this ticket, I'd prefer we add this here so we commit something that is an improvement, rather than an improvement with a slight regression to be fixed later.
        Hide
        Stu Hood added a comment -

        Oops... secondary indexes and counters actually still require a full scan to stay consistent after cleanup. 0002 Adds CleanupStrategy.

        Show
        Stu Hood added a comment - Oops... secondary indexes and counters actually still require a full scan to stay consistent after cleanup. 0002 Adds CleanupStrategy.
        Hide
        Stu Hood added a comment -

        Very, very simple change to use SSTableBoundedScanner during cleanup. The bulk of the patch is test changes.

        Show
        Stu Hood added a comment - Very, very simple change to use SSTableBoundedScanner during cleanup. The bulk of the patch is test changes.

          People

          • Assignee:
            Tyler Hobbs
            Reporter:
            Stu Hood
            Reviewer:
            Marcus Eriksson
          • Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development