Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 3.10
    • Component/s: Compaction, Tools
    • Labels:
      None

      Description

      A tool like cassandra-stress that works with stress yaml but:

      • writes directly to a specified dir using CQLSSTableWriter.
      • lets you run just compaction on that directory and generates a report on compaction throughput.

        Activity

        tjake T Jake Luciani added a comment -

        Just pushed a fix 1d51512effadf57c0f88a17ca67cbed015e2aa99

        jkni Joel Knighton added a comment -

        This is also failing on all trunk_testall CI runs since this commit.

        aweisberg Ariel Weisberg added a comment -

        This commit causes https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-12358-trunk-testall/1/testReport/junit/org.apache.cassandra.io.sstable/CQLSSTableWriterTest/testUnsetValues/ to hard fail, as near as I can tell. I bisected and reverted this change, and the test started passing.

        tjake T Jake Luciani added a comment -

        Committed 47d3b7e7a013b485a2906fc7f0f2fc90e1143966

        krummas Marcus Eriksson added a comment - - edited

        Compaction is range aware (i.e., if a token is in the wrong directory, it will get written to the correct one).

        This looks good to me now, with one small comment/help change: RangeAwareSSTableWriter does not write one sstable per vnode range. It splits the local ranges by the number of data directories and makes sure we never write the same token in two different directories.
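
To illustrate the directory split described in this comment, here is a minimal Python sketch. It is not Cassandra's implementation. It assumes a single local range covering the full Murmur3 token space (signed 64-bit tokens), whereas a real node splits its own local ranges, and the names `split_boundaries` and `directory_for` are hypothetical. The point it shows is that contiguous, non-overlapping slices guarantee that a given token maps to exactly one data directory.

```python
from bisect import bisect_left

# Assumption: Murmur3-style signed 64-bit token space.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def split_boundaries(num_dirs, lo=MIN_TOKEN, hi=MAX_TOKEN):
    """Return the inclusive upper bound of each directory's token slice.

    The range [lo, hi] is cut into num_dirs contiguous slices of
    near-equal width; the last bound is always hi.
    """
    if num_dirs < 1:
        raise ValueError("need at least one data directory")
    span = hi - lo
    return [lo + span * (i + 1) // num_dirs for i in range(num_dirs)]


def directory_for(token, bounds, data_dirs):
    """Pick the single directory whose slice contains the token."""
    if len(bounds) != len(data_dirs):
        raise ValueError("one boundary per data directory is required")
    idx = bisect_left(bounds, token)  # first slice whose upper bound >= token
    if idx >= len(data_dirs):
        raise ValueError("token lies outside the split range")
    return data_dirs[idx]
```

Because the slices are contiguous and each token falls under exactly one upper bound, the same token can never be assigned to two directories, which is the property the comment describes.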

        tjake T Jake Luciani added a comment -

        Marcus Eriksson The only code that uses the RangeAwareSSTableWriter is StreamWriter. I had assumed the compactor would use the RangeAwareWriter, not the flusher.

        I added it as a command line flag, since you probably want to test both the migration to range aware and normal range-aware operation.

        krummas Marcus Eriksson added a comment -

        It seems the ranges are currently not split properly over the data directories when writing. We probably need to use RangeAwareSSTableWriter in AbstractSSTableSimpleWriter (it is imported but not used).

        tjake T Jake Luciani added a comment -

        Slight changes added to maintain support for CASSANDRA-8671

        tjake T Jake Luciani added a comment -

        I pushed more commits to allow you to test CASSANDRA-10540 or any strategy. This works by creating an offline CFS object and using the internal Directories and CompactionStrategy to write to disk and compact.

        I need to fix some more tests, but this should be enough to look at and test.

        krummas Marcus Eriksson added a comment -

        I'm planning to use this to stress test CASSANDRA-10540 - I'm sure I'll have some feedback while doing that

        tjake T Jake Luciani added a comment - - edited

        Pushed to https://github.com/tjake/cassandra/tree/compaction-stress

        testall
        dtest

        Example use (see help for all options):

        # Write 5 GB of sstables using 4 writers
        ./tools/bin/compaction-stress write -d /tmp/compaction -g 5 -p https://gist.githubusercontent.com/tjake/8995058fed11d9921e31/raw/a9334d1090017bf546d003e271747351a40692ea/blogpost.yaml -t 4

        # Compact the data using 4 compactors
        ./tools/bin/compaction-stress compact -d /tmp/compaction -p https://gist.githubusercontent.com/tjake/8995058fed11d9921e31/raw/a9334d1090017bf546d003e271747351a40692ea/blogpost.yaml -t 4

        The output of the compact command, besides stdout, is the compaction.log from CASSANDRA-10805. I think we should extend the compaction log to include more information like row/partition data.

        /cc for input Marcus Eriksson Paulo Motta
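
For anyone scripting the write and compact steps from this comment, the following Python sketch builds the two command lines as argument lists. It uses only the flags that appear in the example (`-d`, `-g`, `-p`, `-t`); the helper names, the default binary path, and the idea of wrapping the tool this way are assumptions for illustration, not part of the tool. It builds the commands without running them.

```python
import shlex

# Assumption: path taken from the write example in the comment;
# adjust it for your own checkout.
BINARY = "./tools/bin/compaction-stress"


def write_command(data_dir, size_gb, profile, threads, binary=BINARY):
    """Build the argv for the 'write' step: -d dir, -g size, -p profile, -t writers."""
    return [binary, "write", "-d", data_dir, "-g", str(size_gb),
            "-p", profile, "-t", str(threads)]


def compact_command(data_dir, profile, threads, binary=BINARY):
    """Build the argv for the 'compact' step: -d dir, -p profile, -t compactors."""
    return [binary, "compact", "-d", data_dir,
            "-p", profile, "-t", str(threads)]


def render(cmd):
    """Return a shell-quoted, copy-pasteable form of an argv list."""
    return shlex.join(cmd)
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute the step; keeping the commands as lists avoids shell quoting problems with the profile URL.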


          People

          • Assignee:
            tjake T Jake Luciani
            Reporter:
            tjake T Jake Luciani
            Reviewer:
            Marcus Eriksson
          • Votes:
            0
            Watchers:
            12