  1. Apache Cassandra
  2. CASSANDRA-11179

Parallel cleanup can lead to disk space exhaustion

Details

    Description

      In CASSANDRA-5547, we made cleanup (among other things) run in parallel across multiple sstables. There have been reports on IRC of this leading to disk space exhaustion, because multiple sstables are (almost entirely) rewritten at the same time. This seems particularly problematic because cleanup is frequently run after a cluster is expanded due to low disk space.

      I'm not really familiar with how we perform free disk space checks now, but it sounds like we can make some improvements here. It would be good to reduce the concurrency of cleanup operations if there isn't enough free disk space to support this.
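As a rough illustration of the idea in the description (lowering cleanup concurrency when free disk space is tight), here is a minimal sketch. It is not how Cassandra performs its free-space checks. The function name `max_safe_jobs` and its parameters are hypothetical, and it assumes the worst case: each rewritten sstable is as large as its source, and the largest sstables are being rewritten at the same time.

```python
def max_safe_jobs(sstable_sizes, free_bytes, max_jobs):
    """Return how many sstables could be rewritten concurrently within free_bytes.

    Hypothetical helper. Worst case assumed: every rewritten sstable is as
    large as its source, and the largest sstables are in flight together.
    """
    largest_first = sorted(sstable_sizes, reverse=True)
    needed = 0
    jobs = 0
    for size in largest_first[:max_jobs]:
        if needed + size > free_bytes:
            break  # one more concurrent rewrite might not fit
        needed += size
        jobs += 1
    return jobs


# Four sstables, 100 units free: only the two largest (50 + 40) fit together.
print(max_safe_jobs([50, 40, 30, 10], free_bytes=100, max_jobs=8))  # prints 2
```

A result of 0 means that, under these assumptions, even a single rewrite of the largest sstable might not fit in the available space.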

        Activity

          jjirsa Jeff Jirsa added a comment - - edited

          Also true of scrub.

          One other side-effect of parallelization worth noting is that source files are not immediately freed upon completion of each individual sstable - if you have n concurrent compactors, and 1 sstable is significantly smaller than the others, it will be finished very quickly, but there will exist a significant period of time when both the original source and resulting cleaned sstable will co-exist on disk (until all n are done?).

          That is, it appears that the current parallel code waits for all in-flight tasks to complete before finalizing, and because those tasks run at different speeds, operators are that much more likely to run out of disk during cleanup.

          tjake T Jake Luciani added a comment -

          Looks like the cleanup issue is that we aren't clearing the transaction early in all cases, so it's held until the end of the compaction.

          branch 3.0
          tests
          dtest


          marcuse Marcus Eriksson added a comment -

          I don't think that is the problem; the rewriter should already be making that call in writer.finish() (https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L367-L368).

          marcuse Marcus Eriksson added a comment -

          I've been testing this a bit, and I don't think we have any problem with cleanup not removing sstables during the operation.

          I ran this: https://github.com/krummas/cassandra-dtest/commits/monitor (I will convert to proper dtest)
          and got this output on 2.1:

          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-2-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-4-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
          ----------------
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-4-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-7-Data.db
          ----------------
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-8-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
          ----------------
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-9-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
          ----------------
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-9-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
          ----------------
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-9-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-10-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
          ----------------
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
          /tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-9-Data.db
          /tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-10-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
          /tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
          ----------------
          

          That is, with a single compactor we only write to a single file at a time, and the old file is gone once the tmp file disappears.

          On 3.0 I get this:

          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-5-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-4-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
          ----------------
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-5-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-4-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
          ----------------
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
          ----------------
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
          ----------------
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-10-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
          ----------------
          /tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-10-big-Data.db
          /tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
          ----------------
          

          The file count never goes above #original_files + 1 with one compactor.

          So this issue is probably down to the fact that people might have 8 concurrent compactors, in which case we will quickly use more disk space.
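For readers who want to reproduce listings like the ones above, here is a minimal sketch of a polling monitor. It is not the script from the linked dtest branch. The function names, the polling interval and the glob pattern are assumptions, with the directory layout (data directory / keyspace / table / *-Data.db) inferred from the paths in the output.

```python
import glob
import os
import time


def list_data_files(node_dir):
    """Return the sorted *-Data.db files under every data directory of a node.

    Layout assumed from the output above:
    <node_dir>/data*/<keyspace>/<table>-<id>/<name>-Data.db
    """
    pattern = os.path.join(node_dir, "data*", "*", "*", "*-Data.db")
    return sorted(glob.glob(pattern))


def monitor(node_dir, interval=1.0, rounds=10):
    """Print the data files every `interval` seconds and return the peak file count."""
    peak = 0
    for _ in range(rounds):
        files = list_data_files(node_dir)
        for path in files:
            print(path)
        print("-" * 16)  # same separator as in the listings above
        peak = max(peak, len(files))
        time.sleep(interval)
    return peak
```

Running `monitor` against a node directory while cleanup is in progress, and comparing the returned peak with the initial file count, gives the same kind of evidence as the listings: with one compactor the peak should stay at the original count plus one.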


          marcuse Marcus Eriksson added a comment -

          Patch to add a --seq option to scrub/upgradesstables/cleanup so that the operation uses only a single thread.

          Branches (each with testall and dtest runs): marcuse/11179, marcuse/11179-2.2, marcuse/11179-3.0, marcuse/11179-3.5, marcuse/11179-trunk

          carlyeks to review, since we were poking at this in CASSANDRA-10829.

          marcuse Marcus Eriksson added a comment -

          ... working on the failing tests
          tjake T Jake Luciani added a comment -

          It wouldn't be much of a change, so I think you should make this an int rather than a boolean, so you can constrain it to anywhere from 1 to N of these at a time. Just block on N futures per iteration. Right now it's 1 or ALL.
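The "block on N futures per iteration" idea can be sketched as follows. This is an illustration only, not the code from the patch (which is Java). The function name `run_in_batches` and the `pool_size` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_batches(tasks, jobs, pool_size=8):
    """Run zero-argument callables with at most `jobs` of them in flight.

    Submits `jobs` tasks to a shared pool, blocks until that whole batch is
    done, then submits the next batch. Results are returned in task order.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    results = []
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for start in range(0, len(tasks), jobs):
            batch = [pool.submit(task) for task in tasks[start:start + jobs]]
            # result() blocks, so the next batch starts only after this one ends.
            results.extend(future.result() for future in batch)
    return results
```

With `jobs=1` this degenerates to sequential execution, and with `jobs` equal to the pool size it matches the original all-at-once behaviour. One trade-off of whole-batch blocking is that a slow task holds back the next batch. A sliding window that submits a new task as soon as any one finishes would avoid that, at the cost of slightly more bookkeeping.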


          marcuse Marcus Eriksson added a comment -

          Updated to use --jobs X or -j X, defaulting to 2 threads.
          thobbs Tom Hobbs added a comment -

          +1 on defaulting to 2 threads. I like having the default be fairly safe.

          carlyeks Carl Yeksigian added a comment -

          Looks good. Just a couple of comments:

          • Would be nice to add a comment to parallelAllSSTableOperation explaining that jobs = 0 means using all compactor threads, so that we remember to propagate that to our argument explanations.
          • Also, it's not clear what would happen if you specified a jobs value higher than the number of concurrent compactors. The expectation is probably that it would override that setting, so either a warning or the inability to do that would be helpful.

          marcuse Marcus Eriksson added a comment -

          Rebased and pushed a new commit with the comments addressed to the repos above. (It outputs a message if -j > concurrent_compactors.)
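The two review points (jobs = 0 meaning all compactor threads, and a message when -j exceeds concurrent_compactors) could be handled along these lines. This is a sketch under assumptions, not the committed code: the function name `resolve_jobs` is hypothetical, and both the message wording and the choice to cap the value at concurrent_compactors are assumptions, since the thread only says that a message is output.

```python
def resolve_jobs(requested, concurrent_compactors):
    """Return (effective_jobs, message) for a requested -j value.

    requested == 0 means "use all compactor threads". Capping a too-large
    request at concurrent_compactors is an assumption of this sketch.
    """
    if requested < 0:
        raise ValueError("jobs must be >= 0")
    if requested == 0:
        return concurrent_compactors, None
    if requested > concurrent_compactors:
        message = (
            "jobs (%d) is higher than concurrent_compactors (%d), using %d threads"
            % (requested, concurrent_compactors, concurrent_compactors)
        )
        return concurrent_compactors, message
    return requested, None
```

Returning the message instead of printing it keeps the helper easy to test. A caller would print it only when it is not None.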
          carlyeks Carl Yeksigian added a comment -

          +1


          marcuse Marcus Eriksson added a comment -

          Committed, thanks.

          People

            marcuse Marcus Eriksson
            thobbs Tom Hobbs
            Marcus Eriksson
            Carl Yeksigian