Cassandra / CASSANDRA-4310

Multiple independent Level Compactions in Parallel

    Details

      Description

      Problem: If you are inserting data into Cassandra and leveled compaction cannot catch up, you will create a lot of files in L0.

      Here is a solution that will help with this and also improve the performance of leveled compaction.

      We can run many compactions in parallel on unrelated data.
      1) For non-overlapping levels. Ex: while an L0 sstable is compacting with L1, we can run compactions in other levels such as L2 and L3 if they are eligible.
      2) We can also run compactions on files in L1 that are not participating in the L0 compaction.
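The two cases above can be sketched as a small scheduling check. This is only an illustration of the idea, not the code in the attached patches: `SSTable`, `Task`, `build_task`, `conflicts` and `pick_parallel_tasks` are hypothetical names, tokens are modeled as plain integers, and every candidate passed in is assumed to be eligible for compaction already. A new task is allowed to run next to the in-flight ones only if it shares no input sstable with them and would not write an overlapping range into the same level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    name: str
    level: int
    first: int  # smallest token covered (inclusive)
    last: int   # largest token covered (inclusive)


def overlaps(a_first, a_last, b_first, b_last):
    """True when two inclusive ranges intersect."""
    return a_first <= b_last and b_first <= a_last


@dataclass(frozen=True)
class Task:
    inputs: tuple  # sstables read by this compaction
    target: int    # level the output is written to

    @property
    def span(self):
        # Token range covered by all inputs, hence by the output.
        return (min(t.first for t in self.inputs),
                max(t.last for t in self.inputs))


def build_task(source, sstables):
    """Source sstable plus every overlapping sstable in the next level."""
    nxt = [t for t in sstables
           if t.level == source.level + 1
           and overlaps(source.first, source.last, t.first, t.last)]
    return Task(inputs=(source, *nxt), target=source.level + 1)


def conflicts(a, b):
    """Tasks conflict if they share an input sstable, or if they would
    write overlapping ranges into the same level."""
    if set(a.inputs) & set(b.inputs):
        return True
    return a.target == b.target and overlaps(*a.span, *b.span)


def pick_parallel_tasks(candidates, sstables, running, max_tasks):
    """Greedily choose tasks that can run alongside `running`."""
    active = list(running)
    picked = []
    for source in candidates:
        if len(picked) >= max_tasks:
            break
        task = build_task(source, sstables)
        if not any(conflicts(task, other) for other in active):
            picked.append(task)
            active.append(task)
    return picked


# Example: an L0 -> L1 compaction is running; look for L1 -> L2 work.
l0 = SSTable("l0-a", 0, 0, 50)
l1 = [SSTable("l1-a", 1, 0, 30),
      SSTable("l1-b", 1, 31, 60),
      SSTable("l1-c", 1, 61, 90)]
l2 = [SSTable("l2-a", 2, 0, 45), SSTable("l2-b", 2, 46, 100)]
all_tables = [l0, *l1, *l2]

running = [build_task(l0, all_tables)]  # reads l0-a, l1-a, l1-b
picked = pick_parallel_tasks(l1, all_tables, running, max_tasks=2)
picked_names = [[t.name for t in task.inputs] for task in picked]
```

In this example only `l1-c` (with `l2-b`) is picked, because `l1-a` and `l1-b` are already inputs of the running L0 compaction. The `max_tasks` cap stands in for whatever concurrency limit the real scheduler applies.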

      This is especially useful if you are using SSDs and are not bottlenecked by IO.

      I am seeing this issue in my cluster. There are more than 50k pending compactions, and disk usage is not that high (I am using SSDs).
      I have set multithreaded compaction to true and am not throttling IO (the throttle value is set to 0).

      1. 4310-v6.txt
        55 kB
        Yuki Morishita
      2. 4310-v5.txt
        54 kB
        Jonathan Ellis
      3. 4310-v3.txt
        33 kB
        Yuki Morishita
      4. 4310-v2.txt
        27 kB
        Yuki Morishita
      5. 4310.txt
        25 kB
        Yuki Morishita

        Activity

        sankalp kohli created issue -
        sankalp kohli made changes -
        Field Original Value New Value
        Description: removed the sentences "This will starve compactions in the lower levels and make things worse." and "There is a comment about this problem in the code as well." The rest of the description was unchanged.


        sankalp kohli made changes -
        Affects Version/s 1.1.1 [ 12319857 ]
        Component/s Core [ 12312978 ]
        sankalp kohli made changes -
        Description: appended "and the disk usage is not that much(I am using SSD)" to the sentence about pending compactions. The rest of the description was unchanged.


        sankalp kohli made changes -
        Summary: "Make Level Compaction go faster(multiple independant compactions in parallel) for insert heavy workload" -> "Multiple independent Level Compactions in Parallel."
        sankalp kohli made changes -
        Labels: "compaction leveled" -> "compaction leveled ssd"
        sankalp kohli made changes -
        Issue Type: Improvement [ 4 ] -> New Feature [ 2 ]
        sankalp kohli made changes -
        Priority: Minor [ 4 ] -> Major [ 3 ]
        sankalp kohli made changes -
        Summary: "Multiple independent Level Compactions in Parallel." -> "Multiple independent Level Compactions in Parallel(Useful for SSD)."
        sankalp kohli made changes -
        Priority: Major [ 3 ] -> Minor [ 4 ]
        sankalp kohli made changes -
        Affects Version/s 1.1.2 [ 12321445 ]
        sankalp kohli made changes -
        Labels: "compaction leveled ssd" -> "compaction features leveled performance ssd"
        sankalp kohli made changes -
        Priority: Minor [ 4 ] -> Major [ 3 ]
        Jonathan Ellis made changes -
        Summary: "Multiple independent Level Compactions in Parallel(Useful for SSD)." -> "Multiple independent Level Compactions in Parallel"
        Assignee Yuki Morishita [ yukim ]
        Fix Version/s 1.2 [ 12319262 ]
        Affects Version/s 1.0.0 [ 12316349 ]
        Affects Version/s 1.1.1 [ 12319857 ]
        Affects Version/s 1.1.2 [ 12321445 ]
        Jonathan Ellis made changes -
        Fix Version/s 1.2.1 [ 12322953 ]
        Fix Version/s 1.2.0 [ 12319262 ]
        Yuki Morishita made changes -
        Attachment 4310.txt [ 12545930 ]
        Yuki Morishita made changes -
        Status: Open [ 1 ] -> In Progress [ 3 ]
        Jonathan Ellis made changes -
        Comment [ bq. I introduced max_concurrent_tasks as new compaction option

        Why not just use existing {{concurrent_compactors}}? ]
        Yuki Morishita made changes -
        Attachment 4310-v2.txt [ 12546906 ]
        Yuki Morishita made changes -
        Status: In Progress [ 3 ] -> Patch Available [ 10002 ]
        Yuki Morishita made changes -
        Attachment 4310-v3.txt [ 12548253 ]
        Jonathan Ellis made changes -
        Attachment 4310-v5.txt [ 12548482 ]
        Yuki Morishita made changes -
        Attachment 4310-v6.txt [ 12548614 ]
        Yuki Morishita made changes -
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Reviewer jbellis
        Fix Version/s 1.2.0 beta 2 [ 12323284 ]
        Fix Version/s 1.2.1 [ 12322953 ]
        Resolution Fixed [ 1 ]
        Gavin made changes -
        Workflow: "no-reopen-closed, patch-avail" [ 12672179 ] -> "patch-available, re-open possible" [ 12753352 ]
        Gavin made changes -
        Workflow: "patch-available, re-open possible" [ 12753352 ] -> "reopen-resolved, no closed status, patch-avail, testing" [ 12756185 ]

          People

          • Assignee: Yuki Morishita
          • Reporter: sankalp kohli
          • Reviewer: Jonathan Ellis
          • Votes: 0
          • Watchers: 6
