CASSANDRA-11179

Parallel cleanup can lead to disk space exhaustion

    Details

      Description

      In CASSANDRA-5547, we made cleanup (among other things) run in parallel across multiple sstables. There have been reports on IRC of this leading to disk space exhaustion, because multiple sstables are (almost entirely) rewritten at the same time. This seems particularly problematic because cleanup is frequently run after a cluster is expanded due to low disk space.

      I'm not really familiar with how we perform free disk space checks now, but it sounds like we can make some improvements here. It would be good to reduce the concurrency of cleanup operations if there isn't enough free disk space to support this.
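
      As a rough illustration of the idea above (not the change that was eventually committed), a free-space-aware limit could be sketched as follows. The function names and the worst-case assumption that each rewrite needs as much extra space as its source sstable are hypothetical and only serve the example.

```python
import shutil


def max_safe_jobs(sstable_sizes, free_bytes, max_jobs):
    """Largest job count whose worst-case rewrite footprint fits in free_bytes.

    Assumption (not from the ticket): a cleanup may rewrite an sstable almost
    entirely, so k concurrent jobs can need up to the combined size of the k
    largest sstables in extra space.
    """
    needed = 0
    jobs = 0
    # Consider the largest sstables first, since they are the worst case.
    for size in sorted(sstable_sizes, reverse=True)[:max(max_jobs, 0)]:
        needed += size
        if needed > free_bytes:
            break
        jobs += 1
    return jobs


def free_bytes_on(path):
    """Free space on the filesystem holding `path`, via the standard library."""
    return shutil.disk_usage(path).free
```

      Under these assumptions, a result of 0 means that even a single worst-case rewrite may not fit in the available space.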

        Activity

        jjirsa Jeff Jirsa added a comment - - edited

        Also true of scrub.

        One other side effect of parallelization worth noting is that source files are not freed immediately when each individual sstable completes. If you have n concurrent compactors and one sstable is significantly smaller than the others, it will finish very quickly, but for a significant period both the original source and the resulting cleaned sstable will coexist on disk (until all n are done?).

        That is, it appears that the current parallel code waits for all in-flight tasks to complete before finalizing, and because those tasks run at different speeds, operators are that much more likely to run out of disk space during cleanup.

        tjake T Jake Luciani added a comment -

        It looks like the cleanup issue is that we aren't clearing the transaction early in all cases, so it's held until the end of the compaction.

        branch 3.0
        tests
        dtest

        krummas Marcus Eriksson added a comment -

        I don't think that is the problem; the rewriter should already be making that call in writer.finish() (https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L367-L368).

        krummas Marcus Eriksson added a comment -

        I've been testing this a bit, and I don't think we have any problem with cleanup failing to remove sstables during the operation.

        I ran this: https://github.com/krummas/cassandra-dtest/commits/monitor (I will convert to proper dtest)
        and got this output on 2.1:

        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-2-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-4-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
        ----------------
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-4-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-7-Data.db
        ----------------
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-8-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
        ----------------
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-9-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
        ----------------
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-9-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
        ----------------
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-9-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-10-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
        ----------------
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
        /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-9-Data.db
        /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-10-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
        /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
        ----------------
        

        That is, with a single compactor only one file is being written at a time, and the old file is gone once the tmp file disappears.

        On 3.0 I get this:

        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-5-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-4-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
        ----------------
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-5-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-4-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
        ----------------
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
        ----------------
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
        ----------------
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-10-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
        ----------------
        /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-10-big-Data.db
        /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
        ----------------
        

        Filecount never goes above #original_files + 1 with one compactor.

        So this issue is probably down to the fact that people might have 8 concurrent compactors, and then we will quickly use more disk space.
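
        For readers who want to reproduce listings like the ones above, here is a minimal sketch of a directory monitor. It is not the script from the linked dtest branch; the function names, the polling interval, and the choice to match files by the `-Data.db` suffix are assumptions made for illustration.

```python
import time
from pathlib import Path


def list_data_files(data_root):
    """Return the sorted paths of every *-Data.db file under data_root."""
    return sorted(str(p) for p in Path(data_root).rglob("*-Data.db"))


def monitor(data_root, interval_seconds=1.0, samples=10):
    """Print the Data.db listing `samples` times and return the peak file count.

    Each sample is followed by a dashed separator, mirroring the output above.
    """
    peak = 0
    for _ in range(samples):
        files = list_data_files(data_root)
        peak = max(peak, len(files))
        for path in files:
            print(path)
        print("----------------")
        time.sleep(interval_seconds)
    return peak
```

        Running `monitor()` against a node's data directories while cleanup is in progress gives a peak file count that can be compared with the number of original sstables.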

        krummas Marcus Eriksson added a comment -

        Patch to add a --seq option to scrub/upgradesstables/cleanup so that the operation uses only a single thread.

        branch testall dtest
        marcuse/11179 testall dtest
        marcuse/11179-2.2 testall dtest
        marcuse/11179-3.0 testall dtest
        marcuse/11179-3.5 testall dtest
        marcuse/11179-trunk testall dtest

        Carl Yeksigian to review, since we were poking at this in CASSANDRA-10829.

        krummas Marcus Eriksson added a comment -

        ... working on the failing tests

        tjake T Jake Luciani added a comment -

        It wouldn't be much of a change, so I think you should make this an int rather than a boolean, so you can constrain it to anywhere from 1 to N of these at a time. Just block on N futures per iteration. Right now it's 1 or ALL.
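
        The "block on N futures per iteration" idea can be sketched roughly as below. This is an illustrative Python analogue, not the Cassandra (Java) implementation; `run_in_batches` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_in_batches(tasks, jobs):
    """Run zero-argument callables with at most `jobs` in flight at a time.

    Tasks are submitted in batches of `jobs`; the loop blocks until the whole
    batch has finished before submitting the next one. Results are returned in
    the same order as `tasks`.
    """
    if jobs < 1:
        raise ValueError("jobs must be >= 1")
    results = []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for start in range(0, len(tasks), jobs):
            batch = [pool.submit(task) for task in tasks[start:start + jobs]]
            wait(batch)  # block on this batch of futures
            results.extend(future.result() for future in batch)
    return results
```

        With `jobs=1` this degenerates to sequential execution, and with `jobs` equal to the number of tasks everything runs at once, which covers the "1 or ALL" cases mentioned above as well as anything in between.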

        krummas Marcus Eriksson added a comment -

        Updated to use --jobs X or -j X, defaulting to 2 threads.

        thobbs Tyler Hobbs added a comment -

        +1 on defaulting to 2 threads. I like having the default be fairly safe.

        carlyeks Carl Yeksigian added a comment -

        Looks good. Just a couple of comments:

        • Would be nice to add a comment to parallelAllSSTableOperation explaining that jobs = 0 means using all compactor threads, so that we remember to propagate that to our argument explanations.
        • Also, it's not clear what would happen if you specified a jobs value higher than the number of concurrent compactors. The expectation is probably that it would override that selection, so either a warning or the inability to do that would be helpful.
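
        One possible reading of the two points above, sketched in Python for illustration only. The helper name and message text are hypothetical, and capping the value at the number of concurrent compactors is an assumption; the ticket only states that jobs = 0 means all compactor threads and that a warning would be helpful.

```python
def resolve_jobs(jobs, concurrent_compactors):
    """Return (effective_jobs, warning_or_None) for a requested jobs value.

    jobs == 0 is treated as "use all compactor threads". A request above
    concurrent_compactors is capped (an assumption) and yields a warning.
    """
    if jobs < 0:
        raise ValueError("jobs must be >= 0")
    if jobs == 0:
        return concurrent_compactors, None
    if jobs > concurrent_compactors:
        warning = (
            f"jobs ({jobs}) is higher than concurrent_compactors "
            f"({concurrent_compactors}); using {concurrent_compactors}"
        )
        return concurrent_compactors, warning
    return jobs, None
```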
        krummas Marcus Eriksson added a comment -

        Rebased and pushed a new commit with the comments fixed to the repos above (It outputs a message if -j > concurrent_compactors)

        carlyeks Carl Yeksigian added a comment -

        +1

        krummas Marcus Eriksson added a comment -

        committed, thanks


          People

          • Assignee:
            krummas Marcus Eriksson
            Reporter:
            thobbs Tyler Hobbs
            Reviewer:
            Carl Yeksigian